-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python debugging support #3075
Python debugging support #3075
Conversation
/* private */ String[] CONFIGURED_INITIALIZATION_TYPES = | ||
Configuration.getInstance().getStringArrayFromProperty("thread.initialization"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work if the property doesn't exist or is empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It fails if it does not exist, but succeeds if empty (with an empty array).
Due to how defaults work (or rather, don't work), I was under the impression we wanted to generally avoid using them, so dh-defaults.prop now contains this property.
// First call in to create a custom function that has the same name as the Java thread (plus a prefix) | ||
PyObject runnableResult = py_deephaven.create_thread_entry(Thread.currentThread().getName()); | ||
// runnable.run(); | ||
// Invoke that function directly from Java, so that we have only this one initial frame | ||
runnableResult.call("__call__", runnable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could do this with a single method. The naming would probably change:
py_deephaven.run_for_java_thread(Thread.currentThread().getName(), runnable)
or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially, and that's how the first implementation worked. However, if we wanted the top-most frame to have a specific name (so that you can tell what Java thread something was running on by looking at the py stack trace, which doesn't come with thread info), we would need to create a custom def to return in the call - the debugger at least loses the __name__
and __qualname__
properties that are assigned to rename defs. As such, we may need to return a pyobject so that it can be called again.
And, if we just use the runnable.run() above (presently commented out), then we don't need to pass runnable at all, so the signature stays the same. In that case however it appears to the debugger that any pydevd.settrace() call in the web IDE actually pauses in the create_change_list
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this provide the same benefits?
def run_for_java_thread(name, runnable):
...
def JavaThread():
runnable.run()
JavaThread()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Changing the function's name (via declaring
JavaThread
not by name, but through eval) so the thread appears: technically yes, but by adding two frames instead of one, so the topmost frame isn't named in this way. - Adding one py frame earlier in the stack than the current code: technically yes, but again, because it adds two frames instead of one, we might be making it harder to understand instead of easier
Remember, these run once per thread, when the thread is created, so on the order of 10ish times on startup and once when the first python command is executed from the web console, so paying the extra jni/py round trip is pretty cheap.
// First call in to create a custom function that has the same name as the Java thread (plus a prefix) | ||
PyObject runnableResult = py_deephaven.create_thread_entry(Thread.currentThread().getName()); | ||
// runnable.run(); | ||
// Invoke that function directly from Java, so that we have only this one initial frame | ||
runnableResult.call("__call__", runnable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This technique of ensuring the java frame includes python in the Runnable stack does rely on the fact that python drops the GIL when calling into java. jpy-consortium/jpy#48 nothing actionable, but good to note it wouldn't work otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but the fact that DHaaL works at all relies on that fact as well, the entire library would deadlock on the gil and ugp if we didn't make this assumption.
public interface ThreadInitializationFactory { | ||
/* private */ String[] CONFIGURED_INITIALIZATION_TYPES = | ||
Configuration.getInstance().getStringArrayFromProperty("thread.initialization"); | ||
/* private */ List<ThreadInitializationFactory> INITIALIZERS = Arrays.stream(CONFIGURED_INITIALIZATION_TYPES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering, do we think there are occasions for more than 1 ThreadInitializationFactory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have assumed for now that each factory should be created once, but it if wants per-thread state, it should do so by putting that state in the createInitializer
method itself, or in the returned Runnable. By only instantiating these once, we grant control to each implementation what kind of scope it wants to have.
For example, the current DebuggingInitializer
never closes the deephaven module (in its current form, it should), but could be modified to instead avoid paying the cost of re-creating that proxy for each thread (not just UGP threads, but also the DeephavenApiServerModule scheduler threads too).
static Runnable wrapRunnable(Runnable runnable) { | ||
Runnable acc = runnable; | ||
for (ThreadInitializationFactory INITIALIZER : INITIALIZERS) { | ||
acc = INITIALIZER.createInitializer(acc); | ||
} | ||
return acc; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure I understand, the reason for this pattern is to allow for initializer-specific cleanup on exit? Otherwise, a pattern of iterative runnable invocations would seem less error-prone. I suppose either version allows arbitrary hijacking of the intended run method; the current pattern could skip calling the wrapped runnable, but the iterative approach could just never return or throw an exception on termination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't that we need to do cleanup on the way out per se, but that we want to insert a stack frame in python. If we end up dropping this frame (which has some weird side effects when you "step out" of the apparent top-level frame), then we could just run one after the other, no wrapping.
The current impl has its option of how it wants to do it - I played with an API that would offer a DSL for either wrapping or prefixing, but didn't really love it, I can try bringing that back if you'd prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, in light of the arm64 issues, a wrapper is necessary. The question then will be how we name that, if JavaThread is clear and simple enough, or if we want to use it to get some more info to the user.
engine/updategraph/src/main/java/io/deephaven/engine/updategraph/UpdateGraphProcessor.java
Outdated
Show resolved
Hide resolved
server/src/main/java/io/deephaven/server/runner/DeephavenApiServerModule.java
Outdated
Show resolved
Hide resolved
// python not enabled, don't accidentally start it | ||
return runnable; | ||
} | ||
DeephavenModule py_deephaven = (DeephavenModule) PyModule.importModule("deephaven") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note from @rcaudy : I haven't reviewed the bottom of this file past this point, or py/server/deephaven/__init__.py
yet.
1b5cc40
to
cb3bbb4
Compare
cb3bbb4
to
552bc50
Compare
eed53de
to
fa33665
Compare
7181b37
to
cb4076f
Compare
cb4076f
to
2a7b4b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving for Python server codeowners, no interface code here.
This commit adds a configuration option to customize engine-created
threads, and provides a default implementation that will register those
threads for debugging with pydevd if python is enabled.
As of this commit, pydevd debugging seems to work correctly, but
VSCode's debugging doesn't work for all threads yet.
Partial #2997