How interactive debugging works (old way)
The Interactive Window is used for running cells in a Python script. These cells can also be debugged.
This page describes how that debugging is implemented under the covers.
This sequence diagram is explained below:
The first step is to inject debugpy into the kernel. Because we don't know the process id of the kernel, we need another way to get the kernel ready for attach. We do this by running this code in the kernel:
import debugpy
debugpy.listen(('localhost', 0))
That causes debugpy to start a server and return the endpoint (host and port) it is listening on. We use this port to generate a launch config that looks something like this:
{
  "type": "python",
  "request": "attach",
  "connect": { "host": "localhost", "port": 5678 },
  "justMyCode": true
}
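The port lookup and config generation can be sketched in Python as follows. This is a minimal illustration, not the extension's code: `pick_free_port` uses a plain socket to show what binding to port 0 does (the OS picks a free port, which is why listening on `('localhost', 0)` yields a usable port number), and `build_attach_config` is a hypothetical helper that fills the reported port into a config like the one above.

```python
import json
import socket


def pick_free_port(host="localhost"):
    """Bind to port 0 so the OS assigns a free port, then report it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def build_attach_config(host, port):
    """Hypothetical helper: build an attach config for a listening debug server."""
    return {
        "type": "python",
        "request": "attach",
        "connect": {"host": host, "port": port},
        "justMyCode": True,
    }


port = pick_free_port()
config = build_attach_config("localhost", port)
print(json.dumps(config, indent=2))
```

In the real flow the port comes from debugpy itself rather than from a throwaway socket; the socket here only demonstrates the port 0 behavior.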
Debugpy ships with the Python extension. Before we attach, we add the path on disk where the Python extension keeps debugpy, so that the kernel can import it. Debugging is disabled for remote kernels anyway, so we don't have to worry about supporting this for remote.
The launch config generated in the previous step is used to attach the debugger to the running debugpy server (much like what is done when launching debugging of a Python file).
VS Code then transitions to debug mode and waits for an event from the debuggee.
The next step is called out as 'Replace kernel's run cell handler'.
What is that code doing? It replaces IPython's run cell method with a new one so that we can set an environment variable BEFORE we run a cell.
Specifically this code here:
predicted_name = __VSCODE_compute_hash(args[1], args[0].execution_count)
os.environ["IPYKERNEL_CELL_NAME"] = predicted_name
Internally, IPython uses the environment variable IPYKERNEL_CELL_NAME to set the name of the pseudo file associated with any code that's run.
If we need an environment variable to be set when running a cell, why not just execute some code in the kernel? This diagram might explain why:
If we're using cells to change the environment variable, those cells themselves end up with the IPYKERNEL_CELL_NAME set for them. In the example above, if CELL_2 calls into code in CELL_1, the debugger won't know that it was the original cell 1, and not the second "cell 1" where the variable was set.
To work around this problem, we instead patch the kernel so it sets the variable itself as it executes code.
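The patching idea can be sketched without a real kernel. Everything below is a stand-in chosen for illustration: `FakeShell` mimics only the parts of the IPython shell the wrapper touches, and `compute_hash` is a hypothetical naming scheme (the actual `__VSCODE_compute_hash` is not shown on this page). The point is only that the wrapper sets the variable as part of running the cell, so no extra cell execution is needed.

```python
import hashlib
import os


class FakeShell:
    """Stand-in for the IPython shell; only what the sketch needs."""

    def __init__(self):
        self.execution_count = 1
        self.seen_names = []

    def run_cell(self, code):
        # Record the cell name that was in the environment while the cell ran.
        self.seen_names.append(os.environ.get("IPYKERNEL_CELL_NAME"))
        self.execution_count += 1


def compute_hash(code, execution_count):
    """Hypothetical naming scheme: hash of cell contents plus execution count."""
    digest = hashlib.sha1(code.encode("utf-8")).hexdigest()[:12]
    return f"ipython_{execution_count}_{digest}.py"


original_run_cell = FakeShell.run_cell


def patched_run_cell(*args, **kwargs):
    # args[0] is the shell and args[1] is the cell code, as in the snippet above.
    predicted_name = compute_hash(args[1], args[0].execution_count)
    os.environ["IPYKERNEL_CELL_NAME"] = predicted_name
    return original_run_cell(*args, **kwargs)


# Replace the handler so every cell gets its own name before it runs.
FakeShell.run_cell = patched_run_cell

shell = FakeShell()
shell.run_cell("x = 1")
shell.run_cell("print(x)")
print(shell.seen_names)
```

Each cell sees a name derived from its own contents and execution count, and no helper cell ever runs in between.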
As you can see in the Python code above, the IPYKERNEL_CELL_NAME is set to the hash of the cell contents plus the execution count. However, in the Interactive Window, we're running code like so:
This means that when the debugger returns a stack frame, its source member will point to something like ipython_34343434343434.py. This is obviously not the same as manualTestFile.py, which is where we want the instruction pointer (IP) indicator to be in VS Code when the stop event fires.
Debugpy allows us to send a custom message that makes it remap ipython_34343434343434.py to manualTestFile.py. This means that when the source locations in the stack frame responses reach VS Code, it will just open the correct file.
This custom message is sent in the Remap source files event in the sequence diagram.
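Conceptually, the remapping turns a location in the pseudo file into a location in the real file. The sketch below shows only that idea. It does not reproduce debugpy's message format, which this page does not describe; `SourceMapEntry`, `remap_frame`, and the starting line of 10 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SourceMapEntry:
    """Hypothetical mapping from a runtime pseudo file to a range in a real file."""
    runtime_path: str  # the ipython_<hash>.py name the kernel reports
    real_path: str     # the file the user is editing
    first_line: int    # 1-based line in real_path where the cell's code starts


def remap_frame(frame, entries):
    """Return a copy of a stack frame dict that points at the real file."""
    for entry in entries:
        if frame["source"]["path"] == entry.runtime_path:
            return {
                **frame,
                "source": {"path": entry.real_path},
                # Line 1 of the pseudo file corresponds to first_line of the real file.
                "line": frame["line"] + entry.first_line - 1,
            }
    return frame  # unknown sources are left untouched


entries = [SourceMapEntry("ipython_34343434343434.py", "manualTestFile.py", 10)]
frame = {"name": "<module>", "line": 3, "source": {"path": "ipython_34343434343434.py"}}
remapped = remap_frame(frame, entries)
print(remapped)
```

Here the remapping is done on a frame after the fact purely for illustration; in the flow described above, debugpy applies the mapping itself before the stack frame responses are sent.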
So the debugger is attached and the file paths are correct when execution happens. What is the enable thread tracing step?
When debugpy is injected into a Python process, it watches the execution of every line to see whether it should hit a breakpoint. This is really slow. That is okay while we're debugging a cell, but what about afterwards?
What if the user debugs one cell but then wants to run the kernel for a number of cells afterwards? We don't want the overhead of debugpy watching every line executing.
Debugpy has a global flag that essentially turns off this watching. That's what the enable thread tracing step does: it sets the global flag so that debugpy starts watching execution again.
It's not shown in the diagram above, but after debugging is complete, the flag is turned off.
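The effect of such a flag can be imitated with the standard `sys.settrace` hook. This is a toy model under stated assumptions, not debugpy's implementation: `tracing_enabled` is a hypothetical stand-in for the global flag, and the tracer merely records line numbers instead of checking breakpoints.

```python
import sys

tracing_enabled = False  # hypothetical stand-in for debugpy's global flag
traced_lines = []


def tracer(frame, event, arg):
    if not tracing_enabled:
        # Returning None means no per-line callbacks for this frame.
        return None
    if event == "line" and frame.f_code.co_filename == "<cell>":
        traced_lines.append(frame.f_lineno)
    return tracer


def run_cell(code):
    exec(compile(code, "<cell>", "exec"), {})


def run_with_flag(code, enabled):
    """Run a cell with the flag on or off and return the lines that were watched."""
    global tracing_enabled
    tracing_enabled = enabled
    traced_lines.clear()
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        run_cell(code)
    finally:
        # Always restore the previous tracer and switch the flag back off.
        sys.settrace(previous)
        tracing_enabled = False
    return list(traced_lines)


cell = "a = 1\nb = a + 1\n"
watched = run_with_flag(cell, True)
unwatched = run_with_flag(cell, False)
print(watched, unwatched)
```

With the flag off, the tracer is consulted once per new frame and then stays out of the way, which is the cheap path we want for ordinary cell execution.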
Before we execute the user's cell, we want execution to stop on the first line of the cell. You might ask: why not just send a breakpoint during the attach initialization (or before we enable tracing)?
We didn't do that and instead just put a breakpoint() instruction in the code that's about to be executed. The reason for this was to prevent VS Code from knowing about the breakpoint and showing it in the UI when the response to the setBreakpoints request was received.
Finally the cell is ready to execute. We execute it normally and because the first line has a breakpoint, it stops right after that point.
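A small sketch of the injected breakpoint, assuming the instruction is simply prepended to the cell source (`add_initial_breakpoint` is a hypothetical helper). The built-in `breakpoint()` calls `sys.breakpointhook`, so the sketch swaps in a recording hook to show that the stop happens before the first line of user code, without opening a real debugger.

```python
import sys


def add_initial_breakpoint(cell_code):
    """Hypothetical helper: make the cell stop before its first real line."""
    return "breakpoint()\n" + cell_code


events = []
namespace = {"events": events}

previous_hook = sys.breakpointhook
# Replace the hook so this sketch records the stop instead of opening a debugger.
sys.breakpointhook = lambda *args, **kwargs: events.append("stopped")
try:
    cell = add_initial_breakpoint("events.append('first line ran')\n")
    exec(compile(cell, "<cell>", "exec"), namespace)
finally:
    sys.breakpointhook = previous_hook

print(events)
```

Because the stop is triggered from inside the executed code, the editor never receives a breakpoint to display.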
Once the stop event fires, debugging the cell is just like debugging any Python code. Variable and stack frame requests are made, and the user can step in and step out.
Then what happens after the user steps off the end of the cell? From debugpy's point of view, the process is now just running non-user code, so it keeps going until user code is hit again.
Once the thread tracing flag is disabled, debugpy stops watching and normal cells can be executed without debugpy breaking into them.
This is different from normal notebook debugging, where debugpy detaches from the kernel altogether.
As mentioned above, this isn't supported for remote kernels. Why is that? Mainly for two reasons:
- Debugpy is loaded from the Python extension, and we didn't want to sync it to the remote machine and load it into the remote kernel.
- Debugpy starts a debug server listening on a port. We didn't think users would want to open another port on their machine.