
socket.pause() does not work at message level - is this by design? #139

Closed
MaxMotovilov opened this issue Apr 27, 2017 · 11 comments

@MaxMotovilov

As a 0mq learning exercise, I started implementing the missing node.js examples from the guide. I ran into trouble here: http://zguide.zeromq.org/page:all#Prototyping-the-Local-and-Cloud-Flows. The queueing logic there relies on the ability to poll sockets selectively. The pause()/resume() pair looked like it would do the trick, so I tried it and discovered that the implementation uses a JS-only mode flag that is checked after the binding signals readiness but before the entire batch of messages is dispatched. However, the flag is not checked inside the loop that iterates over the batch:

var cutoff = 1;
socket.on( 'message', _ => --cutoff == 0 && socket.pause() );

will not actually pause until the entire batch of available messages has been dispatched, blithely running cutoff into negative territory!

Questions:

  1. Is this a design decision or an oversight? The documentation is very sparse.
  2. Is there a preferred alternative to socket pausing that provides something akin to a selective poll?
  3. If the answer to 1 is "oversight" and to 2 is "no", would you accept a patch that changes the granularity of socket.pause() to the level of an individual message?
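For illustration, the difference between a batch-level and a message-level check can be simulated without zmq at all. The makeSocket/dispatchBatch helpers below are hypothetical stand-ins for the binding's dispatch loop, not its actual code:

```javascript
// Simulates how a binding might dispatch a batch of received messages.
// With a batch-level check, the paused flag is consulted once before the
// loop; with a message-level check it is consulted for every message.

function makeSocket(checkPerMessage) {
  const socket = { paused: false, handler: null };
  socket.pause = () => { socket.paused = true; };
  socket.dispatchBatch = (batch) => {
    if (socket.paused) return;                     // batch-level check
    for (const msg of batch) {
      if (checkPerMessage && socket.paused) break; // message-level check
      socket.handler(msg);
    }
  };
  return socket;
}

// Handler that pauses after the first message, as in the snippet above.
function run(checkPerMessage) {
  let cutoff = 1, received = 0;
  const socket = makeSocket(checkPerMessage);
  socket.handler = () => { ++received; --cutoff == 0 && socket.pause(); };
  socket.dispatchBatch(["a", "b", "c"]);
  return received;
}

console.log(run(false)); // batch-level check: all 3 messages delivered
console.log(run(true));  // message-level check: only 1 message delivered
```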
@MaxMotovilov
Author

Furthermore, socket.pause() pauses outbound messages as well as inbound, which makes it an even less suitable substitute for a selective poll: no responses are sent to a peer until we're ready to process more of its requests. My question 2 stands.

@interpretor
Member

I have a very similar problem caused by the BatchList implementation: #152
So perhaps we should think about alternative queuing mechanisms.

@MaxMotovilov
Author

It would really help to get a response from the project owners on the design of this logic... offhand, I can't imagine why a language binding should do additional buffering outside of zmq itself, but it must have been done for a reason. I also wonder just how much production use there has been with node.js... I certainly don't get the feeling that node is a first-class citizen on zeromq.org, which is really too bad given how well the paradigms fit together.

@interpretor
Member

I came across another problem with the BatchList queue:
Since node.js is single-threaded, a big message burst can block the whole process while the BatchList is still processing the messages.
Wouldn't it be better to just pass the messages to the zmq process and let it do the sending?

@MaxMotovilov
Author

That sounds odd. If I understand it correctly, BatchList only pushes the messages onto the zmq queue. The I/O reactor loop that talks to the socket lives inside zmq, in a separate thread, so that should not be a bottleneck given additional CPU cores. If the node.js thread is stuck just copying messages from the BatchList to the queue, how could you possibly fix that? And doesn't it take more time simply to create that many messages? How big a burst are we talking about here?

I haven't really read through the C++ part of the node.js binding, so I may well be missing the key issue...

@interpretor
Member

interpretor commented Jun 14, 2017

I'm talking about 4000 messages per second on a quad-core armhf device.
I haven't examined the problem further, but if I burst-send those messages, the node.js event queue is unable to listen for incoming messages.

The issue may lie elsewhere, but it looks like something is blocking the node.js thread.

@MaxMotovilov
Author

MaxMotovilov commented Jun 14, 2017

Should be relatively easy to tell:

  1. Time your sending loop -- how big a fraction of a second does creating/sending 4K messages take?
  2. Run a setImmediate() queue guard to see if the queue turns over at all; I sometimes use something like:
var count = 0;

// Report the queue turnover rate every 10 seconds.
setInterval( function() {
   console.log( count/10, "events per second" );
   count = 0;
}, 10000 );

queueGuard();

// Re-schedules itself on every turn of the event loop; if the loop is
// blocked, count stops growing.
function queueGuard() {
   ++count;
   setImmediate( queueGuard );
}

Rather crude, and with plenty of overhead, but it should definitely tell you how bad the situation truly is (worst case: you get no console messages at all). If your queue is indeed stuck, but creating/sending 4K messages doesn't take most of a second, then yes, something else is going on...
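Suggestion 1 above could be sketched like this; sendOne() is a hypothetical placeholder for the real socket.send() call:

```javascript
// Measure how long it takes to create and send a burst of messages.
// sendOne() stands in for the actual socket.send() -- swap it out.
function sendOne(msg) {
  return msg.length; // placeholder for socket.send(msg)
}

const N = 4000;
const start = process.hrtime.bigint();
for (let i = 0; i < N; ++i) {
  sendOne(Buffer.from("message " + i));
}
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
console.log(N + " messages in " + elapsedMs.toFixed(1) + " ms");
```

If this loop accounts for most of each second, the burst itself is saturating the thread; if not, the blocking is elsewhere.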

@interpretor
Member

Thanks for the hint, I'll try it when I have time.

@interpretor
Member

@rgbkrk @lgeiger do you have any clue why there is the BatchList queue, and why the messages aren't passed directly to the binary?

@rgbkrk
Member

rgbkrk commented Jun 19, 2017

The original authors for those sections are not around anymore. The primary reason I got involved was to fix builds on all the platforms and create the prebuilts for use across Electron and node. There are a variety of friction points in the current API and implementation that I'd love to see fixed up, even if it means a major release. @ronkorving was trying to do an overall refactor in JustinTulloss/zeromq.node#516 while @minrk was also hoping to fix some of the API up here.

@rolftimmermans
Member

The latest 6.0 beta release has a new API that addresses some fundamental issues with the previous API (including the one described here). See #189 for more reasoning behind the new API.

If you run into any problems, feel free to report them here or in a new issue.
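For reference, the 6.0 API exposes receiving as a pull-based async iterator, so backpressure at message granularity falls out naturally: you simply stop awaiting the next message. A minimal sketch under that model (the address, socket type, and the handle() function are illustrative, not prescribed):

```javascript
const zmq = require("zeromq");

async function run() {
  const sock = new zmq.Pull();
  await sock.bind("tcp://127.0.0.1:3000");

  for await (const [msg] of sock) {
    // Nothing is dispatched behind your back: the next message is only
    // read when the loop comes back around to await it, so any await
    // inside this body effectively "pauses" the socket per message.
    await handle(msg); // handle() is your own processing function
  }
}
```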
