Break calls into two or three bytecodes for better specialization. #210
Replies: 5 comments 1 reply
-
As a motivating example, consider calls to a normal Python class (by "normal" I mean one that inherits from object and does not override My initial attempt to do this is both complex and broken, see python/cpython#30415 By splitting the Likewise, |
Beta Was this translation helpful? Give feedback.
-
Attempting to implement the above scheme, it seems like breaking the
The main benefit of merging Internally the interpreter would have several variables (probably in a struct) to describe the "shape" of the call. To outline how this would work, take the example
Cost of adding
|
Beta Was this translation helpful? Give feedback.
-
A bit more detail: There will be two For example
and
|
Beta Was this translation helpful? Give feedback.
-
bpo issue: https://bugs.python.org/issue46329 |
Beta Was this translation helpful? Give feedback.
-
We already do this partly with the
PRECALL_METHOD
instruction which is inserted before theCALL_[NO_]KW
to setup the correct argument offsets.I propose adding a
PRECALL_FUNCTION
instruction to be inserted for the non-method case.This would, in the non-specialized case, setup argument offsets for a non-method call.
The motivation for this change is to split the specialization of calls into two parts.
Currently we have a sort of
N*M
problem with specializing calls. We want to specialize for both the type of callable (builtin-function, Python function, bound-method, builtin-class, Python class) and for the argument handling (simple arguments, *args, **kwargs, defaults, etc.)By splitting the call sequence into two we need N+M specializations instead. Since N is at least 5 and M is at least 3, this saves a lot of duplication. Instructions specialized for both type and shape are larger than specializations for just one, so we end up with
(N*M*2)
vs.(N+M)
.The obvious downside of this is the increased memory use for code objects. The extra 2 bytes for the
PRECALL_FUNCTION
is not an issue, although the additional location information may be. The extra 8 or 16 bytes for the cache is significant, but should be tolerable.Beta Was this translation helpful? Give feedback.
All reactions