Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wait until 0MQ sockets are created before starting R #43

Merged
merged 2 commits into from
Jun 16, 2023

Conversation

lionel-
Copy link
Contributor

@lionel- lionel- commented Jun 16, 2023

Using a one-off channel that is activated after all 0MQ sockets are created.

Addresses posit-dev/positron#720. I could no longer reproduce the crash using this branch.

@lionel- lionel- requested a review from jmcphers June 16, 2023 08:45
) -> Self {
let iopub_tx = iopub_tx.clone();
spawn!("ark-r-main-thread", move || {
// Block until 0MQ is initialised before starting R to avoid
// thread-safety issues. See https://github.com/rstudio/positron/issues/720
let _ = conn_init_rx.recv().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should use recv_timeout() so it doesn't have the potential to hang indefinitely -- after some reasonable timeout we should either log an error and exit or go ahead and let R start with a warning (the latter seems better to me but your call!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I agree it's better to let R start with a warning in that case.

But since this message passing is entirely in-process a hang should never happen right? Is the idea to be defensive about future potential programming bugs? After many years of coding in C, I've learned to happily ignore impossible situations that would otherwise crash or hang the program 😅

Also we're using blocking recv() in other parts of the codebase, do you think we should generally be defensive like this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's just defensive programming. Since 0MQ is initialized on another thread there's all kinds of things that could happen (an exception? the initialize calls don't return or deadlock?).

Good question re: recv() ... In most places we use recv() on a threaded message loop wherein we wait for a message, process it, then wait for the next message. That pattern doesn't use a timeout b/c a long delay between messages is normal.

(that said, very possible that there are a few places that use recv() that should be recv_timeout())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants