Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove control queue #1210

Merged
merged 8 commits into from
Feb 27, 2024
Merged

Conversation

ianthomas23
Copy link
Collaborator

Currently when the control channel receives a message on the control stream, it puts it in a tornado.queues.Queue, and messages are popped off the queue one a time to be handled using process_control. I don't think the queue is necessary as the ZMQ socket/stream queues messages itself.

This PR removes the control queue, simplifying the code. The functions _flush_control_queue and poll_control_queue are removed as there is no queue to flush or poll. I assume the latter is considered public by its naming convention, but I don't know if any downstream libraries will be relying on its existence.

There are other queues used for the control channel and debugger that I have not looked at yet.

I have ensured that only one control message is handled at a time, to preserve the current behaviour. This uses a boolean flag in dispatch_control. This is a little strange as the use of ZMQStream.on_recv allows us to concurrently handle multiple control messages that we then have to suppress. There is future work to be done here to perhaps use the ZMQ socket rather than the stream, and jupyter-server/jupyter_server#1362 and jupyter/jupyter_client#997 are relevant.

tests/test_kernel.py Outdated Show resolved Hide resolved
async def dispatch_control(self, msg):
# Ensure only one control message is processed at a time
while self._block_control:
await asyncio.sleep(0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it could be implemented with an asyncio.Lock.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that looks like a much more elegant approach.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ianthomas23
Copy link
Collaborator Author

The tests pass for this except for the downstream ipyparallel tests which are failing in various abort scenarios. It looks to me that ipyparallel doesn't use a separate control channel thread and to support that here I will need to add extra code (if the shell and control channels share a thread) to force the control messages to be processed before the shell ones. This will bring back some of the previous _flush_control_queue functionality, but in a different way.

ipykernel/kernelbase.py Outdated Show resolved Hide resolved
self.control_stream.flush()

socket = self.control_stream.socket
while socket.poll(1):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is blocking for 1ms, right? I think it should just check if there is a message:

Suggested change
while socket.poll(1):
while socket.poll(0):

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to, but if I do that the ipyparallel tests fail locally for me.

@ianthomas23
Copy link
Collaborator Author

To get test_debugger.py to pass the minimum dependencies CI run I needed to increase the minimum version of pyzmq from 24 to 25. In turn this required a bump of jupyter-client and tornado minimum versions (jupyter/jupyter_client#915). Summary of min version changes:

Dependency Previous min version Current min version Release date of current min version
jupyter-client 6.1.12 8.0.0 2023-01-26
pyzmq 24.0.0 25.0.0 2023-01-12
tornado 6.1.0 6.2.0 2022-07-03

I don't know if these changes will be considered too aggressive too soon.

@blink1073
Copy link
Contributor

blink1073 commented Feb 26, 2024

Hmm, perhaps instead I can cut a version of jupyter_client 6.x that supports pyzmq 25?

@blink1073
Copy link
Contributor

Or, perhaps we wait and fold this in to an ipykernel 7.x release with #1079?

@davidbrochart
Copy link
Collaborator

Or, perhaps we wait and fold this in to an ipykernel 7.x release with #1079?

I was thinking about that, but would this PR still make sense once #1079 is in?

@ianthomas23
Copy link
Collaborator Author

I was thinking about that, but would this PR still make sense once #1079 is in?

I see that #1079 still uses a queue for the control channel but it is no longer a Tornado queue of course. So this probably isn't needed after that, but the idea of simplifying by removing the queue might still be worth looking at.

If #1079 is near merging then we should ignore this for now. But it looks like there is quite a lot of work to do for Windows and downstream projects?

@davidbrochart
Copy link
Collaborator

Sorry Ian if your work in this PR ends up not being merged, I was aware of #1079 of course but Steve picked it up recently and I don't know if it's close to being merged, although he seems to be progressing fast.

@ianthomas23
Copy link
Collaborator Author

Sorry Ian if your work in this PR ends up not being merged, I was aware of #1079 of course but Steve picked it up recently and I don't know if it's close to being merged, although he seems to be progressing fast.

No problem. This has been my first attempt at a substantial change to ipykernel and even if not merged it has been a good learning experience to get used to the CI and in particular the impact on downstream projects.

@blink1073
Copy link
Contributor

blink1073 commented Feb 27, 2024

I had a go at the 6.x branch of jupyter_client, but I don't think it is revive-able. I think we can merge this and consider main to be 7.x, and focus on folding in #1079 as well.

@blink1073 blink1073 merged commit c1d944e into ipython:main Feb 27, 2024
32 checks passed
@ianthomas23 ianthomas23 deleted the remove_control_queue branch February 27, 2024 14:11
assert start > previous_end
previous_end = end

assert end >= start + timedelta(seconds=sleep)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails downstream in IPython with:

TypeError: '>=' not supported between instances of 'str' and 'datetime.datetime'

Copy link
Collaborator Author

@ianthomas23 ianthomas23 Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the same symptoms that I observed at #1210 (comment). ipyparallel unfortunately monkey-patches jupyter_clients conversion of strings to datetimes so it is a no-op, and some versions of pytest-asyncio don't fully isolate the tests so sometimes some of these start and end dates come through as strings.

In this PR updating pytest-asyncio to 0.23.5 worked, but this is evidently not sufficient as the IPython failure already uses pytest-asyncio-0.23.5.

I think the sensible approach here is to assume that in this test sometimes, outside of our direct control, the dates are strings and convert them to datetimes. Otherwise this test is not testing what it is supposed to, it is testing the types of some fields returned in messages. I'll submit a PR here ASAP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants