Wait until 0MQ sockets are created before starting R #43

lionel- · 2023-06-16T08:27:46Z

Using a one-off channel that is activated after all 0MQ sockets are created.

Addresses posit-dev/positron#720. I could no longer reproduce the crash using this branch.

jmcphers · 2023-06-16T16:30:00Z

crates/ark/src/shell.rs

    ) -> Self {
        let iopub_tx = iopub_tx.clone();
        spawn!("ark-r-main-thread", move || {
+            // Block until 0MQ is initialised before starting R to avoid
+            // thread-safety issues. See https://github.com/rstudio/positron/issues/720
+            let _ = conn_init_rx.recv().unwrap();


I think this should use recv_timeout() so it can't hang indefinitely. After some reasonable timeout we should either log an error and exit, or go ahead and let R start with a warning (the latter seems better to me but your call!)
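For illustration, here is a minimal sketch of the timeout-then-warn approach suggested above. It is not the code merged in this PR: it assumes `std::sync::mpsc` rather than whichever channel crate ark actually uses, and `wait_for_init`, `InitWait`, and the 20 second timeout are hypothetical names and values chosen for the example.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Outcome of waiting for the (hypothetical) "sockets are ready" notification.
#[derive(Debug, PartialEq)]
enum InitWait {
    Ready,
    TimedOut,
    SenderDropped,
}

/// Wait for the one-off notification, but never longer than `timeout`.
fn wait_for_init(init_rx: &Receiver<()>, timeout: Duration) -> InitWait {
    match init_rx.recv_timeout(timeout) {
        Ok(()) => InitWait::Ready,
        Err(RecvTimeoutError::Timeout) => InitWait::TimedOut,
        Err(RecvTimeoutError::Disconnected) => InitWait::SenderDropped,
    }
}

fn main() {
    let (init_tx, init_rx) = mpsc::channel::<()>();

    // Stand-in for the thread that creates the sockets.
    let setup = thread::spawn(move || {
        // ... create sockets here ...
        init_tx.send(()).expect("receiver should still be alive");
    });

    // Assumed timeout value; pick whatever is reasonable for the application.
    match wait_for_init(&init_rx, Duration::from_secs(20)) {
        InitWait::Ready => {}
        // Do not hang or exit: warn and carry on with startup.
        other => eprintln!("warning: starting anyway, init wait ended with {other:?}"),
    }

    // ... start the main work (R, in this PR) here ...
    setup.join().unwrap();
}
```

Distinguishing a timeout from a dropped sender is optional, but it makes the warning more useful when the setup thread has panicked rather than stalled.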

Done. I agree it's better to let R start with a warning in that case.

But since this message passing is entirely in-process, a hang should never happen, right? Is the idea to be defensive about potential future programming bugs? After many years of coding in C, I've learned to happily ignore impossible situations that would otherwise crash or hang the program 😅

Also, we're using blocking recv() in other parts of the codebase. Do you think we should generally be defensive like this?

Yes, it's just defensive programming. Since 0MQ is initialized on another thread, there are all kinds of things that could happen (an exception? the initialization calls never return, or deadlock?).

Good question re: recv() ... In most places we use recv() on a threaded message loop wherein we wait for a message, process it, then wait for the next message. That pattern doesn't use a timeout b/c a long delay between messages is normal.

(that said, very possible that there are a few places that use recv() that should be recv_timeout())
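The message-loop pattern described above can be sketched roughly as follows. This is an illustrative example only, assuming `std::sync::mpsc`; the `Message` enum and `message_loop` function are hypothetical and do not come from the ark codebase.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical message type for the sketch.
enum Message {
    Work(u32),
    Shutdown,
}

/// Process messages until `Shutdown` arrives or all senders are dropped.
/// Returns the sum of the `Work` payloads that were handled.
fn message_loop(rx: Receiver<Message>) -> u32 {
    let mut total = 0;
    // `recv()` blocks with no timeout: a long idle gap between messages
    // is normal here, so there is nothing to time out on.
    while let Ok(msg) = rx.recv() {
        match msg {
            Message::Work(n) => total += n,
            Message::Shutdown => break,
        }
    }
    total
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || message_loop(rx));

    for n in [1, 2, 3] {
        tx.send(Message::Work(n)).unwrap();
    }
    tx.send(Message::Shutdown).unwrap();

    println!("processed total: {}", worker.join().unwrap());
}
```

The loop still terminates cleanly without a timeout, because `recv()` returns an error once every sender has been dropped.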

That makes sense, thanks

lionel- requested a review from jmcphers June 16, 2023 08:45

Wait until 0MQ sockets are created before starting R

10749f3

lionel- force-pushed the bugfix/init-thread-safety branch from 583b58c to 10749f3 June 16, 2023 09:16

jmcphers requested changes Jun 16, 2023

Wait with a timeout

3bc2c1a

jmcphers approved these changes Jun 16, 2023

lionel- merged commit 983c48a into main Jun 16, 2023

lionel- deleted the bugfix/init-thread-safety branch June 16, 2023 19:14

DavisVaughan mentioned this pull request Jun 20, 2023

ark: Weird crash when restarting R 4.1 repeatedly posit-dev/positron#720

Closed

lionel- mentioned this pull request Jun 29, 2023

ark: Should Ark be split in two processes? posit-dev/positron#802

Closed
