Python 3.12 - ValueError: generator already executing #9138
Comments
Based on my very non-expert research, it seems that due to using As to how exactly the cpython commit that I bisected earlier plays into this, I have no clue. |
Thanks for the report. I've seen this from time to time. Do you have any sense of the severity of this issue? |
This is preventing us from upgrading to pylint 3, as well as switching our default python version to py3.12. This not only affects the CI for ansible/ansible, but content authors using our It will also cause issues with the ability for Fedora (or other distros switching to python3.12 as their default) to package ansible, due to Fedora 39 needing to include pylint 3 as a result of their inclusion of Python 3.12. As such, ansible-test installed from their packages will be partially unusable. |
Thanks for the extra details. I was wondering if the error was thrown only in object finalization after pylint's main program finishes. Obviously we need to clean it up either way, but this sounds more serious than the error I'd seen from time to time before. |
The exception output is interspersed in the normal output from pylint, so it doesn't necessarily seem to happen at the very end. I'm unsure of the exact internal timing of things; it could be in object finalization, but it doesn't seem to be after the main program finishes. |
I mean the "easy" solution is just to override the default Python 3.12 appears to be much more aggressive about finalizing the (apparently) abandoned wrapped generators- I was able to proxy the ones that are involved in our repro to see when they're finalized (sadly
They aren't- it's happening inline, and much earlier with 3.12 than 3.11. I've also tried having the proxied generators capture the callstack on init, so I can examine the ones that appear to be leaked later to figure out who created them- |
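To make the proxying idea in the comment above concrete, here is a minimal sketch of a wrapper that records where a generator was created and notes when the wrapper is finalized. `TrackedGenerator`, `numbers`, and the `finalized` list are hypothetical names for illustration; this is not the instrumentation actually used on Ansible's codebase.

```python
import gc
import traceback


class TrackedGenerator:
    """Hypothetical proxy that remembers where a generator was created."""

    def __init__(self, generator, log):
        self._generator = generator
        self._log = log
        # Drop the last frame (this __init__) so the stack ends at the caller.
        self.created_at = traceback.format_stack()[:-1]

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._generator)

    def __del__(self):
        # Runs when the proxy is finalized; the wrapped generator is
        # released right afterwards. Record the creation stack for later review.
        self._log.append(self.created_at)


def numbers():
    yield 1
    yield 2


finalized = []
proxy = TrackedGenerator(numbers(), finalized)
first = next(proxy)  # consume one item, then abandon the generator
del proxy
gc.collect()  # only matters on interpreters without reference counting
```

Each entry appended to `finalized` is the formatted call stack of the code that created the abandoned generator, which is the kind of information needed to work out who leaked it.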
(we're also going to try a throwaway plugin that disables the exception printing in |
I don't know if it's of any use, but here's a whittled-down version of that

    from ansible.plugins.loader import action_loader

    class FieldAttributeBase:
        def _load_module_defaults(self):
            resolved_action = self._resolve_action()
            if resolved_action:
                validated_defaults_dict[resolved_action] = defaults
            if True:
                resolved_action = self._resolve_action()
                if resolved_action:
                    validated_defaults_dict[resolved_action] = defaults

        def _resolve_action(self):
            prefer = action_loader.find_plugin_with_context()
            if prefer.resolved:
                return prefer.resolved_fqcn

|
Yeah, at least improves the signal/noise ratio while beating on it- thanks! |
Here is an even smaller reproducer. It doesn't import anything else from the library, but it still has to be linted from inside the library (replacing I think an inference around

    # pylint: disable = line-too-long, missing-function-docstring, missing-module-docstring, missing-class-docstring, too-few-public-methods
    # pylint: disable = no-member, using-constant-test, protected-access, attribute-defined-outside-init, undefined-variable
    from importlib import import_module

    class PluginLoadContext:
        @property
        def resolved_fqcn(self):
            if True:
                return None
            return self._resolved_fqcn

    class PluginLoader:
        def find_plugin_with_context(self):
            _ = import_module('asdf').__file__
            if self._load_module_source():
                self._module_cache[None] = None
            self._load_config_defs()
            self._display_plugin_load()
            if self._plugin_instance_cache:
                self._plugin_instance_cache[None] = self._plugin_instance_cache[None]
            self._load_config_defs()
            if self.undefined:
                __import__()
            PluginLoadContext()._resolved_fqcn = undefined
            resolved_action = self._resolve_action()
            if resolved_action:
                {}[resolved_action] = None
            return PluginLoadContext()

        def _resolve_action(self):
            prefer = PluginLoader().find_plugin_with_context()
            if prefer.resolved:
                return prefer.resolved_fqcn
            return None

After all the disables, Pylint gives this 10/10, even with the |
We've updated our CI to use a (hopefully temporary) plugin that installs a custom After spending the majority of the past several days digging pretty deeply into this (including hacking a bunch of instrumentation into CPython's Once I hacked in enough instrumentation to be able to roughly correlate the C-side generator wakeups to their associated Python frames, it appears that the issue is around a re-entrant C-side wakeup that's kicked off when a particular abandoned
Anyway, I need to get back to other work, but I might come back to it at some point- I just can't squash the feeling that there's a serious issue lurking here... |
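For readers wondering what a masking plugin of this kind could look like, the following is a rough sketch only. The thread text does not preserve which hook the Ansible plugin replaces, so this assumes the errors are reported through Python's `sys.unraisablehook`; `make_filtering_hook` and `TARGET_MESSAGE` are hypothetical names. The `register(linter)` function is the entry point pylint calls when it loads a plugin module.

```python
import sys
import types

# Assumption: the noisy errors reach sys.unraisablehook with this message.
TARGET_MESSAGE = "generator already executing"


def make_filtering_hook(previous_hook):
    """Build a hook that drops one specific ValueError and forwards the rest."""

    def hook(unraisable):
        exc = unraisable.exc_value
        if isinstance(exc, ValueError) and str(exc) == TARGET_MESSAGE:
            return  # swallow only this specific error
        previous_hook(unraisable)

    return hook


def register(linter):
    """Plugin entry point: wrap whatever hook is currently installed."""
    sys.unraisablehook = make_filtering_hook(sys.unraisablehook)


# Small demonstration with a recording hook in place of the real one.
seen = []
demo_hook = make_filtering_hook(seen.append)
masked = types.SimpleNamespace(exc_value=ValueError(TARGET_MESSAGE))
other = types.SimpleNamespace(exc_value=RuntimeError("boom"))
demo_hook(masked)  # filtered out
demo_hook(other)   # forwarded to the previous hook
```

Filtering on both the exception type and the exact message keeps every other unraisable exception visible, which matters for a workaround that is meant to be temporary.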
Thanks for the deep dive.
I'm willing to try this, even if there's a python 3.12 issue that could be reported upstream. Specifically I think we can revert pylint-dev/astroid@0d4f73d. I'll see what the pylint primer tool says about that. |
We also got hit by this issue. Really annoying. We introduced another plugin for fixing this issue based on: Is there any chance to somehow fix this? Maybe even hacking with the solution provided by Ansible. Pylint is just unstable and useless out of the box for us with the latest Python. |
Spent some time bisecting cpython today to figure out when the error first appeared. This is what I've got so far
I haven't had time yet to look at these PRs in more detail. Wanted to post this here in case someone might be interested. This was the full error between

    Exception on node <Call l.21 at 0xffffadfd9910> in file '/usr/src/test.py'
    Traceback (most recent call last):
      File "/usr/astroid/astroid/decorators.py", line 90, in inner
        yield next(generator)
              ^^^^^^^^^^^^^^^
    StopIteration

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/usr/src/pylint/utils/ast_walker.py", line 91, in walk
        callback(astroid)
      File "/usr/src/pylint/checkers/base/basic_checker.py", line 708, in visit_call
        if utils.is_terminating_func(node):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/src/pylint/checkers/utils.py", line 2165, in is_terminating_func
        return True
               ^^^^
      File "/usr/astroid/astroid/nodes/node_ng.py", line 170, in infer
        for i, result in enumerate(self._infer(context=context, **kwargs)):
      File "/usr/astroid/astroid/decorators.py", line 95, in inner
        raise InferenceError(
    astroid.exceptions.InferenceError: StopIteration raised without any error information.

|
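The traceback above shows a wrapper calling `next(generator)` and re-raising `StopIteration` as an `InferenceError`. A minimal sketch of that general pattern follows. It is not astroid's actual code: `InferenceError` here is a local stand-in class, and `raise_if_nothing_yielded`, `infer_nothing`, and `infer_two` are hypothetical names.

```python
import functools


class InferenceError(Exception):
    """Local stand-in for astroid.exceptions.InferenceError."""


def raise_if_nothing_yielded(func):
    """Wrap a generator function so that an empty result becomes an error."""

    @functools.wraps(func)
    def inner(*args, **kwargs):
        generator = func(*args, **kwargs)
        try:
            # Pull the first item eagerly; an empty generator raises StopIteration.
            yield next(generator)
        except StopIteration as error:
            raise InferenceError(
                "StopIteration raised without any error information."
            ) from error
        # Pass the remaining items through unchanged.
        yield from generator

    return inner


@raise_if_nothing_yielded
def infer_nothing():
    return
    yield  # unreachable; its presence makes this function a generator


@raise_if_nothing_yielded
def infer_two():
    yield "a"
    yield "b"


try:
    list(infer_nothing())
    caught = None
except InferenceError as error:
    caught = error
```

Because the wrapper is itself a generator that drives another generator, each wrapped call produces two generator objects, and the outer one can be abandoned while the inner one is still suspended.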
Just for fun I checked the That all leads me to believe that this might actually be a cpython issue in Maybe it does make sense to add the pylint plugin @matejsp suggested to overwrite |
Thanks for bisecting the CPython builds to track down possible sources- I hadn't had time to try that... Just at a glance, I'd be a little surprised if the dict materialization change actually fixed whatever the problem was, since it doesn't appear to be related to the generator bits at all. It's more likely that pylint/astroid are positively affected by that 3.13 change in other ways, so the timing/circumstances improve just enough that the known reproducers go away (just as they do when other changes around caching or whatever are made). I also wish there was a way to come up with a simpler standalone repro, but I've thus far failed to induce the problem outside of pylint/astroid on Ansible's codebase. 😞 |
Seems you're right. I just tested pylint with On For @nitzmahone I would be interested if you can reproduce that for |
Did some more bisecting today. At least for Home Assistant the change that does appear to fix all ValueErrors seems to be Also tested cherry-picking it onto 3.12 which did also work. |
I've now backported it to 3.12. |
Thanks a lot @iritkatriel! I've run some more tests and the issue seems to be truly gone now. |
On the plus side, no issues running on 3.13.0a4 with Ansible's codebase (and our Cautiously optimistic that the CPython upstream fix takes care of this one everywhere- once 3.12.3+ is widely available, we'll kill off our masking plugin and hope that's the end of it. Thanks everyone! |
Thanks @nitzmahone 👍🏻 I went ahead and opened #9454 to include the |
This is an issue as of Python 3.12.5 again after python/cpython#120467. See logs. |
Bug description
It seems that with a combination of a commit to astroid, and a change in Python 3.12, I am getting a ValueError: generator already executing.
Unfortunately, despite about 12 hours of trying, I don't have a simple reproducer, and this will involve running pylint against https://github.com/ansible/ansible
I've logged here instead of directly against astroid, as I was not able to find an easy way to cause the error with astroid alone.
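For context on what the error message itself means, here is a minimal standalone illustration, unrelated to pylint or astroid: Python raises this `ValueError` whenever a generator is resumed while it is already running. It is not a reproducer for this bug, and `reentrant_generator` and `trigger` are names made up for the example.

```python
def reentrant_generator():
    """Return a generator whose body calls next() on itself."""

    def body():
        # `gen` is already running when this line executes, so the
        # interpreter refuses to resume it a second time.
        yield next(gen)

    gen = body()
    return gen


def trigger():
    """Run the re-entrant generator and return the error message, if any."""
    try:
        next(reentrant_generator())
    except ValueError as error:
        return str(error)
    return None


message = trigger()
print(message)  # prints "generator already executing"
```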
Configuration
Command used
Pylint output
If I insert additional debugging into astroid, to use traceback.print_stack() when that ValueError happens, I see something like:
Traceback...
Expected behavior
No traceback
Pylint version
OS / Environment
N/A
Additional dependencies