-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSii blocks the node process when the JS code is not being called by the host #4133
Comments
Alright yes, the current implementation of the host in To a large extent this is because the host needs to maintain a specific order of operations, as the jsii protocol currently is sequential (request -> response to that request). If order of operations is breached, then the response sent could be for a different request than the last one sent by the client, who'd then be unable to match that response to a given request. I venture that we either need to build some state machines so that we can ensure that a given response is only sent when it's supposed to, so that the client can make sense of it all... or we need to re-design the interop model so that requests have IDs that are passed back in the response, so they could be sent out-of-order. Following this, we'd also need to make sure that multiple requests/responses don't get simultaneously sent in such a way that the JSON IPC stream would get corrupted (although maybe we're safe with that due to how there can only be 1 JS-executing thread at any given time) |
Right, yes. I think if you unblocked node, you could very quickly and naturally end up in a situation where the request responses would be out of order, because an unblocked async might want to call the host (this would definitely happen with my ContractCase project, which passes logs across the boundary so they can be idiomatically handled by the host's test framework).
With the caveat that I only have the context from the runtime architecture diagrams (and related documentation), this feels like the more reasonable of the two to me.
I agree for calls from node -> host, but I think you might need to take care not to multi-thread calls from host -> node (but you probably have that problem regardless of how the protocol works). If it turns out to be a problem, there are a few possible solutions there, I think. Sooo... this sounds like a substantial rework, and I might not be qualified to do it without more context of how JSii internals work. On the other hand, this issue kills my current approach for multi-platform dead -and JSii is otherwise such a perfect fit for my use-case. If a few hours work over a few PRs could help fix this, I'd like to help out. What do you think the best way to proceed is? |
I can take a look at spiking something over the weekend - could you point me at the communication code? I see this interface which looks like it's the place where node -> host happens. Is that right? How does that relate to this module? Also, presumably the other side of that communication boundary is implemented in each host language? How is the protocol tested? Would a good approach be to modify the node side, then follow breaking tests until it works? Any code pointers (or general advice) would be super helpful. Also sorry for so many questions! |
Actually, there might be a better way with less blast radius - If the node side of the communication boundary was controlled with a mutex, it could be possible to maintain the current contract between the hosts and the node side, but allow node to continue other unrelated promises. This would stop the core blocking the node thread, but wouldn't allow multi-threaded access across the JSii boundary (which I think is the current state anyway). |
Thanks for this awesome library! While investigating #4133 , I noticed a few typos in the comments. This PR corrects them. No behaviour changes are introduced. Two of the three comments are in exposed TS doc comments in `@jsii/kernel`. Let me know if you'd prefer a different commit type for those. --- By submitting this pull request, I confirm that my contribution is made under the terms of the [Apache 2.0 license]. [Apache 2.0 license]: https://www.apache.org/licenses/LICENSE-2.0
Describe the bug
In (at least) Java, the node process is blocked while it is waiting for new method calls.
Specifically, promises that are running in the JSii-ed node process don't run at all when the host process has no active calls to the JSii-ed process.
I have a small repro of this bug in this repository. Here's a breakdown of what's in that repo:
longRunningSomething()
has completed.*The console logs have a number in them that tell you which call of
longRunningSomething()
those logs were kicked off by.Although this is an
async
example, the blocking of the node process happens regardless of whether or not the call in to it is async. I only made an async example so that I could makelongRunningSomething()
take a while so you can see that the ticks run while node is doing something that it is planning to return to java.Expected Behavior
When the java code at that link is run, I expect the
(1)
ticks to continue after the call tolongRunningSomething()
while the Java thread is doing something else.Current Behavior
When the java code at that link is run, the
(1)
ticks stop once the JS returns to Java.The
(1)
ticks restart whenReproduction Steps
With JDK 20.0.1 (this is just the version I know works, others probably do too):
Observe that the ticks stop printing when Java is waiting
Possible Solution
My guess is that the stdin/out protocol for communicating between JSii and the host uses blocking calls on the JS side. I suspect that non-blocking calls would solve this problem, and possibly several others.
Because JSii communicates across a stdin/out protocol with another process, I think there's quite a bit of flexibility in the way that async is treated - one side could see it as async, while the other sees it as sync (and vice versa).
Additional Information/Context
This is related to this issue in that I think this is the cause, rather than aync hairiness.
(that issue describes the exact situation that I have in my real code, though)
This is a blocking issue for me, I would be happy to look in to it (but definitely would need some guidance on where to look)
SDK version used
jsii@5.1.1
Environment details (OS name and version, etc.)
node 18.13.0, JVM 20.0.1, MacOS 13.3.1 (a)
The text was updated successfully, but these errors were encountered: