Executors might ignore instrumentation. #109369
Comments
Because instrumentation can be turned on in a finalizer and each With that in mind, we need to do the following:
Do I read between the lines that we are not proceeding with the plan to reimplement
That is still the plan, but we don't want to be relying on it for correctness. Regardless of whether faster-cpython/ideas#582 will make that set of micro-ops smaller, meaning fewer guards and better optimizations.
…idually and globally. (GH-110384)
… individually and globally. (pythonGH-110384)
Bug report
Code handed to the optimizer may not include instrumentation. If instrumentation is added later, the executor does not see it.
We remove all `ENTER_EXECUTOR` instructions when instrumenting, but that doesn't fix the problem of executors that are still running.
Example
Consider a loop that calls `foo()` repeatedly while `large_number` counts down to zero.
The loop gets turned into an executor, and sometime before `large_number` reaches zero, a debugger gets attached and, in some callee of `foo`, turns on monitoring of calls. We would expect all subsequent calls to `foo()` to be monitored, but they will not be, as the executor knows nothing about the call to `foo()` being instrumented.
Let's not rely on the executor/optimizer to handle this.
We could add a complex de-optimization strategy, so that executors are invalidated when instrumentation occurs, but that is the sort of thing we want to be doing for maximum performance, not for correctness.
It is much safer, and ultimately no slower (once we implement fancy de-optimization) to add extra checks for instrumentation, so that unoptimized traces are correct.
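As a concrete version of the example scenario, and of what "correct" means here, the following is a minimal sketch rather than the reproducer from the report. It uses `sys.setprofile` as a stand-in for a debugger turning on call monitoring from inside a callee; the names `profiler`, `run`, `attach_at` and `calls_seen` are hypothetical and chosen only for illustration.

```python
import sys

calls_seen = []  # hypothetical log of monitored calls to foo()


def profiler(frame, event, arg):
    # Record only Python-level calls into foo().
    if event == "call" and frame.f_code is foo.__code__:
        calls_seen.append(frame.f_code.co_name)


def foo(counter, attach_at):
    # Stand-in for "a debugger gets attached in some callee":
    # monitoring is switched on while the loop below is already running.
    if counter == attach_at:
        sys.setprofile(profiler)


def run(large_number, attach_at):
    calls_seen.clear()
    try:
        while large_number > 0:
            foo(large_number, attach_at)
            large_number -= 1
    finally:
        sys.setprofile(None)  # always switch monitoring off again
    return len(calls_seen)


# Every call made after monitoring was switched on should be reported.
monitored = run(10, 6)
print(monitored)
```

With a correct interpreter, every call to `foo()` made after the one that enabled monitoring is reported, whether or not the loop has been optimized. An executor that ignored instrumentation would report fewer.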
The solution
The solution is quite simple: add a check for instrumentation after every call.
We can make this less slow (and less simple) by combining the eval-breaker check and the instrumentation check into one. This makes the check slightly more complex, but should speed up `RESUME` by less than it slows down every call as every call already has an eval-breaker check.
Combining the two checks reduces the number of bits available for versions from 64 to 24 (we need the other bits for eval-breaker, gc, async exceptions, etc.). A 64-bit number never overflows (computers don't last long enough), but 24-bit numbers do.
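One way to picture the combined check is a single word that holds both the eval-breaker flags and the instrumentation version, so that one comparison covers both. The sketch below is only an illustration under assumed values: the 8-bit flag field, the bit positions, and the names `pack`, `needs_slow_path` and `bump_version` are hypothetical and not taken from the CPython source.

```python
# Hypothetical layout: low bits hold eval-breaker style flags,
# the next 24 bits hold the instrumentation version.
EVENT_BITS = 8          # assumed width of the flag field
VERSION_BITS = 24       # width given in the issue text
EVENT_MASK = (1 << EVENT_BITS) - 1
VERSION_MASK = (1 << VERSION_BITS) - 1


def pack(version, events):
    """Combine a version and pending-event flags into one word."""
    return ((version & VERSION_MASK) << EVENT_BITS) | (events & EVENT_MASK)


def needs_slow_path(breaker_word, code_version):
    """Single comparison: true if any flag is set OR the version differs."""
    return breaker_word != pack(code_version, 0)


def bump_version(version):
    """Advance the version; unlike a 64-bit counter, this wraps around."""
    return (version + 1) & VERSION_MASK


# No flags, matching version: fast path.
print(needs_slow_path(pack(3, 0), 3))
# A pending flag, or a newer instrumentation version: slow path.
print(needs_slow_path(pack(3, 1), 3), needs_slow_path(pack(4, 0), 3))
```

The wraparound in `bump_version` is the cost of the narrower field: after enough instrumentation changes, an old version number can reappear, so overflow has to be handled explicitly.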
This will have three main effects, beyond the hopefully very small performance impact for all code:
Making the check explicit in the bytecode.
We can simplify all the call instructions by removing the `CHECK_EVAL_BREAKER` check at the end and adding an explicit `RESUME` instruction after every `CALL`.
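To get a rough feel for how much bytecode this would add, one can count the call sites in a function with the standard `dis` module. This is a hedged sketch: matching opcode names by the `CALL` prefix is a heuristic assumed here, since call opcode names differ between CPython versions, and `count_call_sites` and `sample` are hypothetical names.

```python
import dis


def count_call_sites(func):
    """Count instructions whose opcode name starts with CALL.

    Each such site would gain one explicit RESUME under the proposal,
    so this approximates the number of added instructions.
    """
    return sum(
        1 for ins in dis.get_instructions(func)
        if ins.opname.startswith("CALL")
    )


def sample(foo, bar):
    # Two plain calls, so two call sites.
    foo()
    bar()


print(count_call_sites(sample))
```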
Although this has the disadvantage of making the bytecode larger and adding dispatch overhead, it does have the following advantages:
`RESUME` to `RESUME_CHECK`, which might cancel out the additional dispatch overhead.
Linked PRs
PyLong_Type #111642