Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-88691: Shrink the CALL caches #103230

Merged
merged 8 commits into from
Apr 5, 2023
Merged

Conversation

brandtbucher
Copy link
Member

@brandtbucher brandtbucher commented Apr 3, 2023

Nothing fancy here, just recompute min_args in the CALL_PY_WITH_DEFAULTS implementation rather than caching it. It's not a particularly expensive thing to recompute, and is only used by this one particular flavor of call. So it probably isn't worth two bytes of cache space.

No perf impact.

@brandtbucher brandtbucher added performance Performance or resource usage interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Apr 3, 2023
@brandtbucher brandtbucher requested a review from markshannon April 3, 2023 23:30
@brandtbucher brandtbucher self-assigned this Apr 3, 2023
@brandtbucher
Copy link
Member Author

Updating to get rid of the cached function version, too. We're only using this to confirm that the function is still "simple" and (in the case of CALL_PY_FUNCTION_WITH_DEFAULTS) that it has a tuple of default values. Like min_args, these conditions probably aren't expensive enough to justify the extra cache space.

As with the recent BINARY_SUBSCR_GETITEM changes, this actually makes the specializations for CALL_BOUND_METHOD_EXACT_ARGS, CALL_PY_EXACT_ARGS, and CALL_PY_WITH_DEFAULTS more flexible, since we don't care about the exact function version observed during specialization anymore.

Still no perf impact.

assert(PyTuple_CheckExact(func->func_defaults));
int defcount = (int)PyTuple_GET_SIZE(func->func_defaults);
assert(defcount <= code->co_argcount);
int min_args = code->co_argcount - defcount;
DEOPT_IF(argcount > code->co_argcount, CALL);
Copy link
Member

@markshannon markshannon Apr 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need these checks at all?
I suspect that I added them, but I'm not sure why.

If func->func_version == func_version then the only variable is argcount which is oparg + is_meth.
So we only need to know which of the two possible values of is_meth is safe.
Can precompute those at specialization, so we only need to check is_meth at runtime.

Copy link
Member Author

@brandtbucher brandtbucher Apr 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need these checks at all? I suspect that I added them, but I'm not sure why.

It looks like we just had space for it at the time (when cache sizes were fixed).

If func->func_version == func_version then the only variable is argcount which is oparg + is_meth. So we only need to know which of the two possible values of is_meth is safe. Can precompute those at specialization, so we only need to check is_meth at runtime.

The obvious way to do this would be to split this into two specializations: one that expects is_meth, and one that doesn't.

With that said, I'm not sure it's worth breaking up. There would be a decent amount of code duplication, and this opcode is relatively rare (~0.1% of all instructions executed). So maybe we should just leave it.

(Maybe if we really wanted to, we could use the low bit of the func_version cache entry.)

@@ -2450,7 +2450,7 @@ dummy_func(
}
DEOPT_IF(!PyFunction_Check(callable), CALL);
PyFunctionObject *func = (PyFunctionObject *)callable;
DEOPT_IF(func->func_version != func_version, CALL);
DEOPT_IF(!_PyFunction_IsSimple(func), CALL);
PyCodeObject *code = (PyCodeObject *)func->func_code;
DEOPT_IF(code->co_argcount != argcount, CALL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep the version number check, we can convert this to a check on is_meth, which might be quicker as it saves reaching into the function object.

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to keep the version number check as we will need the profiling information it provides for the tier2 optimizer.

There are some improvements we can make to CALL_PY_EXACT_ARGS and CALL_PY_WITH_DEFAULTS, but those are for a different PR.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@brandtbucher
Copy link
Member Author

Makes sense. I'll revert my latest commit and land just the min_args removal.

@brandtbucher brandtbucher merged commit b4978ff into python:main Apr 5, 2023
gaogaotiantian pushed a commit to gaogaotiantian/cpython that referenced this pull request Apr 8, 2023
warsaw pushed a commit to warsaw/cpython that referenced this pull request Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants