Adding queue to prevent 'Bitcoin JSON-RPC: Work queue depth exceeded' errors #23

Closed
wants to merge 1 commit into from

Conversation

karelbilek

@karelbilek karelbilek commented Mar 22, 2017

I have added a queue, fixing recurring "Bitcoin JSON-RPC: Work queue depth exceeded" errors.

The queue size can be set from calling code by options (I will add it to bitcore-node next); by default, 16 is used, since that's the default bitcoind queue size.

Fixes issues:

and maybe others

I am using the same async version as bitcore-node, so it's not installed twice with different versions :)
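
For readers who want to see the idea in isolation, here is a minimal, dependency-free sketch of a concurrency-limited task queue. It is not the code from this PR (which relies on the `async` library); `createQueue`, `push`, `running`, `waiting` and `DEFAULT_CONCURRENCY` are hypothetical names, and 16 simply mirrors the default mentioned above.

```javascript
'use strict';

// Hypothetical default, mirroring the value mentioned in the PR description.
var DEFAULT_CONCURRENCY = 16;

// Create a queue that runs at most `concurrency` tasks at once.
// A task is a function that receives a `done` callback and must call it
// when its work (for example, one RPC request) has finished.
function createQueue(concurrency) {
  var limit = concurrency || DEFAULT_CONCURRENCY;
  var waiting = [];
  var running = 0;

  // Start waiting tasks until the limit is reached or nothing is left.
  function next() {
    while (running < limit && waiting.length > 0) {
      var task = waiting.shift();
      running++;
      task(onceDone());
    }
  }

  // Build a `done` callback that frees its slot only on the first call.
  function onceDone() {
    var called = false;
    return function() {
      if (called) { return; }
      called = true;
      running--;
      next();
    };
  }

  return {
    push: function(task) {
      waiting.push(task);
      next();
    },
    running: function() { return running; },
    waiting: function() { return waiting.length; }
  };
}
```

With `createQueue(2)`, pushing five tasks starts two immediately and holds the other three until a running task calls `done`.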

@karelbilek
Author

karelbilek commented Mar 22, 2017

I have also fixed the tests (for some reason, they didn't pass; maybe a different Node version).

I am not sure what Travis will do with it... and it fails. I will edit the tests so they pass both on my PC and on Travis.

@coveralls

Coverage Status

Coverage decreased (-0.8%) to 99.16% when pulling 49af46f on runn1ng:master into da5d5ec on bitpay:master.

@karelbilek
Author

New node in travis fixed the tests

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 3a0b96a on runn1ng:master into da5d5ec on bitpay:master.

@levino levino left a comment

Overall a correct implementation of the feature. I personally would love the diff to be smaller / the changes to be more lightweight.

There are also some bigger issues, such as changing the Node version, which is a no-go from my perspective. This needs to be addressed or at least discussed.

.travis.yml Outdated
@@ -1,6 +1,6 @@
language: node_js
node_js:
- '0.10'
- '7'

@levino levino Mar 24, 2017

I think this should be avoided. Changing the Node version in Travis is a pretty major change. Why should this be necessary?

Author

Well, I ran the tests on my PC and they failed (because Node changed the error message on JSON parsing slightly), so I corrected the tests, but then they failed on Travis. So I corrected Travis.

I don't think it's a major change, but I don't really care, it's just about different error in the tests. If this is a blocker, I can remove the two commits.

Author

OK, I will remove the test changes (this and the JSON line), I agree that it doesn't belong in this PR

lib/index.js Outdated
@@ -3,6 +3,8 @@
var http = require('http');
var https = require('https');

var async = require('async');
Line 5 is empty. Should not be the case. I would remove it.

Author

yeah good idea

lib/index.js Outdated
@@ -22,6 +24,10 @@ function RpcClient(opts) {
this.log = RpcClient.loggers[RpcClient.config.logger || 'normal'];
}

var queueSize = opts.queue || 16;
Constant values (like 16) should be set at the top or in a config file like

var DEFAULT_CONCURRENCY=16
and then

var queueSize = opts.queue || DEFAULT_CONCURRENCY

Would be really annoying to go looking for that value.

Author

True. But that isn't done for the other values either (user/pass), and I didn't want to refactor too much... but yeah, I can change that.

lib/index.js Outdated
@@ -38,90 +44,100 @@ RpcClient.config = {
logger: 'normal' // none, normal, debug
};

-function rpc(request, callback) {
+function rpc(request, originalCallback) {
I see no reason to change this parameter name. Should not be done in my opinion.

Author

Well I can either change callback to originalCallback and then callback stays in the rest of the code (since I need to call both the original callback and the "task" callback).

Or I can leave callback, like you suggest, and then I will need to make newCallback and call newCallback everywhere instead of callback.
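
The naming question above comes from having two callbacks in play at once. A minimal sketch of the first option, assuming a queue object with a `push(task)` method that hands each task a `taskCallback`; `queued`, `fn` and `queue` are hypothetical names, not the PR's actual code, and the order of the two calls inside `callback` is an assumption:

```javascript
'use strict';

// Wrap a callback-style function so every call goes through `queue`.
// `queue` only needs a push(task) method; each task receives `taskCallback`,
// which tells the queue that the task has finished.
function queued(fn, queue) {
  return function(request, originalCallback) {
    queue.push(function(taskCallback) {
      // Inside the task, `callback` keeps its old name, so code that
      // already calls `callback` would not need to change. It releases
      // the queue slot and then forwards the result to the caller.
      var callback = function() {
        taskCallback();
        originalCallback.apply(null, arguments);
      };
      fn(request, callback);
    });
  };
}
```

The second option would keep `callback` as the parameter name and introduce a differently named inner function, which means renaming every call site inside the task instead.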

lib/index.js Outdated
for (var k in self.httpOptions) {
options[k] = self.httpOptions[k];
var task = function(taskCallback) {
var callback = function() {
What about error handling? Can this still be done properly?

Author

It's done exactly as in the original code

lib/index.js Outdated

var called = false;
var called = false;
Please remove all whitespace changes from the PR. This just bloats the diff and is very annoying when reviewing.

Author

It's not an unnecessary whitespace change. The code is in a new function that wasn't there before.

I can make two commits, one without the whitespace change and one that just shifts the lines, but is that really necessary?

"chai": "^1.10.0",
"coveralls": "^2.11.2",
"istanbul": "^0.3.5",
"mocha": "^2.1.0",
"sinon": "^1.12.2"
},
"dependencies": {
"async": "^1.3.0"
I would suggest using fixed deps or yarn. This right here will lead to nondeterministic builds and heavy pain in the future. Async is a well-maintained library, that is for sure, but they can still f**k up a release. Better safe than sorry.

So for the time being I would write the line like this:

"async":"1.3.0"

Author

bitcore-node uses exactly this dependency, "async": "^1.3.0". Given that this is used only in bitcore-node, it doesn't matter really

@karelbilek
Author

Thanks for the feedback. I agree with some of it and will make some small changes.

I can remove the changes in tests/travis if that's a blocker for merging, I don't really care (it was really only so that my PC could run the tests, since I have a new Node version; but I can change it back).

Anyway, bitpay doesn't seem to reply to pull requests or merge them, so I don't know if it matters :)

@coveralls

Coverage Status

Coverage decreased (-48.8%) to 51.24% when pulling f6154ce on runn1ng:master into da5d5ec on bitpay:master.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 6c319b5 on runn1ng:master into da5d5ec on bitpay:master.

@karelbilek
Author

OK, I made the patch significantly smaller by making it a new function instead of wrapping it in a function-inside-function.

I kept the async version in package.json, since the ^version is used in the whole bitcore/insight

@levino

levino commented Mar 24, 2017

@Runn1ng I was just reviewing as a bystander. I am actually not a maintainer of this codebase and cannot block or merge anything.

@karelbilek
Author

Yeah, I know. Collaborators get a little badge on GitHub :)

karelbilek added a commit to trezor-graveyard/bitcore-node that referenced this pull request Mar 29, 2017
losh11 pushed a commit to litecore-archive/litecore-node that referenced this pull request May 11, 2017
@elichai

elichai commented May 15, 2017

Why is no one merging it?

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 6cb02d6 on runn1ng:master into da5d5ec on bitpay:master.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling ee25142 on runn1ng:master into da5d5ec on bitpay:master.

@Berndinox

... years later

@adamu

adamu commented Nov 8, 2017

Why is this PR showing commit ee25142 instead of 6c319b5?

@karelbilek
Author

Oh sorry, I probably pushed the smart fee commit to my branch in my repo by mistake. Thanks for noticing, I will force-push again.

@gasteve
Member

gasteve commented Jan 15, 2018

I'm not sure this is the right approach to dealing with this issue ... since bitcoind-rpc is used client side, and there can be many concurrent users of a given bitcoind node, the work queue could still be exceeded even with this logic in place. Isn't it better to handle such errors gracefully than to burden this library with logic that would only prevent the issue if a single process was exceeding the bitcoind work queue? Also, a wrapper of some sort that exists outside this repo could accomplish the same thing if it's really desired.

@carnesen
Contributor

@gasteve I haven't looked into the specifics of the implementation in this PR, but it's common for a client library to provide a mechanism for limiting the number of concurrent requests. For example, https://github.com/brianc/node-postgres uses connection "pooling". It's true that for this to be effective there needs to be some level of coordination between clients and server (at least in terms of configuration) so that each client sets its concurrency appropriately. For example, if the server allows 16 concurrent tasks and there are four clients, then perhaps to be safe each client would self-limit to four concurrent tasks.
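
The arithmetic in that example can be written down as a small helper. This is only a sketch of an even split; `perClientConcurrency` is a hypothetical name, and it assumes every client is configured with the same limit.

```javascript
'use strict';

// Split a server-side work queue evenly between clients, rounding down
// and never going below one request in flight per client.
function perClientConcurrency(serverQueueDepth, clientCount) {
  if (clientCount < 1) {
    throw new RangeError('clientCount must be at least 1');
  }
  return Math.max(1, Math.floor(serverQueueDepth / clientCount));
}

console.log(perClientConcurrency(16, 4)); // prints 4
```

Note that the floor of one request per client means the combined limit can still exceed the server's queue once there are more clients than queue slots, so an even split like this only helps while the number of clients stays small.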

@levino

levino commented Jan 16, 2018

@gasteve Valid point. But this is about fixing a DoS vulnerability ASAP. As long as you have no PR that implements your suggested improvements, I would say: done is better than perfect.

@levino

levino commented Jan 16, 2018

But anyhow, this will not get merged because no maintainer gives a damn...

@levino

levino commented Jan 16, 2018

Thinking a bit more about it, I see another issue: this does not actually fix the DoS vulnerability at all. An attacker can still flood the API with carefully designed requests so that the queue fills up and no other user is able to successfully submit a single request (it will get queued at position 1 million or so and only be resolved after 5 hours). In order to protect the insight-api against DoS, we need to queue "per user" on a much higher level.

Also, it should be fine to get a "bitcoin rpc work queue depth exceeded" error if it is handled gracefully further up in the stack instead of just being sent straight out to the user. One could retry or something.

So all in all I agree with the point that this here is the wrong place to try to fix the issue. Also the proposed changes are NOT fixing the issue at hand, which is: Anyone can take down an insight-api deployment with no cost.
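
A rough sketch of what "retry or something" could look like further up the stack, using exponential backoff. Everything here is hypothetical: `retryOnQueueFull`, `isQueueFull` and the option names are made up for illustration, and matching on the error message is an assumption about how the error reaches the caller.

```javascript
'use strict';

// Hypothetical check: treat errors whose message mentions the work queue
// as temporary. How the error is actually shaped is an assumption.
function isQueueFull(err) {
  return Boolean(err) && /Work queue depth exceeded/i.test(String(err.message));
}

// Call `attempt(callback)` and retry with exponential backoff while it
// fails with a queue-full error. `schedule` defaults to setTimeout and is
// injectable so the delay logic can be exercised without real timers.
function retryOnQueueFull(attempt, options, callback) {
  var maxRetries = options.maxRetries || 5;
  var baseDelayMs = options.baseDelayMs || 100;
  var schedule = options.schedule || setTimeout;
  var tries = 0;

  function run() {
    attempt(function(err, result) {
      if (isQueueFull(err) && tries < maxRetries) {
        // Double the delay on each retry: base, 2x base, 4x base, ...
        var delay = baseDelayMs * Math.pow(2, tries);
        tries++;
        schedule(run, delay);
        return;
      }
      // Success, a different error, or retries exhausted: report as-is.
      callback(err, result);
    });
  }

  run();
}
```

Retrying only smooths over short bursts; it does nothing about the per-user fairness problem described above, which would still have to be solved at a higher level.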

@winteraz

winteraz commented Jan 16, 2018

I'm trying to index the data from scratch and I get this error. I haven't received a single API request (i.e., there is no chance I got DDoS-ed). Does this PR fix fresh indexing as well?
I have a big server (64 GB RAM, SSD, etc.) and have only managed to index up to 6.17%... this is really frustrating.
[2018-01-16T10:11:11.098Z] error: RPCError: Bitcoin JSON-RPC: Work queue depth exceeded
[2018-01-16T10:11:26.100Z] error: RPCError: Bitcoin JSON-RPC: Work queue depth exceeded
[2018-01-16T10:11:41.101Z] error: RPCError: Bitcoin JSON-RPC: Work queue depth exceeded
[2018-01-16T10:12:56.109Z] error: RPCError: Bitcoin JSON-RPC: Work queue depth exceeded
[2018-01-16T10:13:11.109Z] error: RPCError: Bitcoin JSON-RPC: Work queue depth exceeded
[2018-01-16T10:13:26.110Z] error: RPCError: Bitcoin JSON-RPC: Work queue depth exceeded

@karelbilek
Author

@levino yes this does not really solve the DDoS vulnerability.

However it stops "accidental" overflow of the RPC queue.

Bushstar pushed a commit to FeatherCoin/feathercore-node that referenced this pull request Jun 5, 2018
Bushstar pushed a commit to fiscalobject/ufocore-node that referenced this pull request Jun 11, 2018
eugene-sy added a commit to meritlabs/lightwallet-stack that referenced this pull request Jul 9, 2018
@levino

levino commented Oct 31, 2018

Can someone close this? Thanks.

@matiu matiu closed this Oct 31, 2018