Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use global shared zmq.Context as default #549

Open
minrk opened this issue May 28, 2020 · 3 comments
Open

use global shared zmq.Context as default #549

minrk opened this issue May 28, 2020 · 3 comments

Comments

@minrk
Copy link
Member

minrk commented May 28, 2020

Follow-up from #548:

Using the global shared context (zmq.Context.instance()) by default is the standard way to use zmq and meant for multithreaded applications, but we are currently using a single Context per KernelManager to workaround some issues (#437). We should switch back to a shared global context as the default because it saves significant resources (avoids allocating IO threads and FDs for each kernel), but not before establishing robust testing to confirm that we aren't restoring the issues that were fixed by moving to 1:1 context:kernel. It is my hope that setting a minimum pyzmq version is enough to resolve such issues, but that may be optimistic.

@minrk minrk changed the title use global shared Context as default use global shared zmq.Context as default May 28, 2020
@MSeal
Copy link
Contributor

MSeal commented May 28, 2020

Thanks for pulling that topic out. I think the most important thing is to make a test plan to try it out. Some of the downstream libraries have parallelism tests (in nbclient and papermill) that should cover the basics parallelism patterns.

I think this list would cover patterns that have had issues in the past but we can expand if there's other edges to check:

  • Sequential executions within same process
  • Parallel executions with async on Windows
  • Parallel executions with async on OSX
  • Parallel executions with async on Linux
  • Parallel executions with threadpool on Windows
  • Parallel executions with threadpool on OSX
  • Parallel executions with threadpool on Linux
  • Parallel executions with forking (multiprocessing) on Windows
  • Parallel executions with forking (multiprocessing) on OSX
  • Parallel executions with forking (multiprocessing) on Linux
  • Parallel executions with nested forking (multiple process fork layers) on Windows
  • Parallel executions with nested forking (multiple process fork layers) on OSX
  • Parallel executions with nested forking (multiple process fork layers) on Linux

Some of these should be redundant tests but there were parallelism issues in pretty much every one of these boxes over the past year or two up and down the stack :/.

@minrk
Copy link
Member Author

minrk commented Jun 8, 2020

Thanks for listing scenarios! What constitutes a valid test? Just creation of sockets and sending a message?

@MSeal
Copy link
Contributor

MSeal commented Jun 8, 2020

Just creation of sockets and sending a message?

And teardown, yeah. I've used parallel papermill or nbclient processes over a simple notebook as a way to test in the past.

I thought of additional tests:

  • Launch a context, then launch a subprocess that also launches a context, then cleanup parent
  • Launch a context, then launch a subprocess that also launches a context, then cleanup child

This is the reason globals can hurt, as the os fork will share the global memory state and it may try to communicate to the wrong context or cleanup shared state outside the process. Good to note what the behavior here would be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants