-
-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explanation makes unclear reference to "first" and "second" imports #1804
Comments
I agree, the comment suffered from 'rot' over time and should be adjusted. I definitely don't recall to have done anything smart about imports, and if so, it's apparently gone. |
To check if I was missing anything historical that should be specifically updated, I've just tried the However, in Python 2 there was a Reloading the
If this (but with the more commonly used |
This clarifies the comment that explains the significance of setting the Git.GIT_PYTHON_GIT_EXECUTABLE attribute when running it did not suceeed but it hadn't been set before (or was set to None). This happens in the code path in Git.refresh where has_git is False and old_git is None, provided the refresh mode doesn't call for the initial refresh to raise an exception on failure. It was previously described in terms of a "first" and "second" import, but as discussed in gitpython-developers#1804, the rationale and effect were not altogether clear.
This clarifies the comment that explains the significance of setting the Git.GIT_PYTHON_GIT_EXECUTABLE attribute when running Git failed but the attribute wasn't set before. This happens in the code path in Git.refresh where has_git is False and old_git is None, provided the refresh mode doesn't call for the initial refresh to raise an exception on failure. It was previously described in terms of a "first" and "second" import, but as discussed in gitpython-developers#1804, the rationale and effect were not altogether clear.
The
git.cmd.Git.refresh
class method includes this code:GitPython/git/cmd.py
Lines 478 to 481 in d28c20b
It is unclear what "the difference between a first import and a second import" is referring to. (I should have noticed this issue when working on #1725, in which I actually edited that comment to improve its formatting, capitalization, and punctuation, but I apparently did not notice it at that time.)
That code can run multiple times, accessing the same class-wide state each time, because
Git.refresh
can be called multiple times (and, being public, may reasonably be called by code outside of GitPython). But this is not related to how many imports, or import attempts, occur. The comment would make sense if Python's import system worked differently, but seems at odds with the actual design. As I understand it:If an import succeeds, the resulting module object is cached in
sys.modules
. Subsequent imports use the same module object and do not rerun the module's code.For example, suppose
hello.py
contains:Then running
import hello
from a REPL prints the greeting the first time it is run, but running it subsequent times produces no repeated side effect.If an import fails, the new module object, which may itself be incompletely populated, is removed from
sys.modules
, before control returns via the raised exception propagating up the call stack to the code that attempted the import. Even if that module object turns out still to exist, it will not be reused in subsequent attempts to import the code. A subsequent import attempt will rerun the module's code from scratch, creating a new module object which serves as a new namespace. The state in the old module object is not used; global variables of the module, including classes, and thus also including their attributes, are all new.For example, suppose
throw.py
contains:Then running
import throw
from a REPL raisesValueError
with a value of1
each time it is run, andsys.a
is populated with the abandoned module object from each attempt, all of which are separate objects. (Importingsys
in the REPL allows it to be inspected, sinceimport sys
succeeds the first time it occurs, whether that is in the REPL or inthrow.py
, and thus allimport sys
get the samesys
module object with the samea
attribute.)If the contents of
sys.modules
are explicitly modified (rather than implicitly, due to the effect ofimport
andfrom
statements), one of the above two situations still applies, depending on the nature of the mutation. The exception to this would be if a reference to the old module object is obtained and patched back intosys.modules
. That would be a very unusual thing to do, except perhaps for purposes of debugging or experimentation, and I'm reasonably sure the intent of that comment--and the code it documents--is not to cover such cases. Furthermore, that would still not rerun top-level code of the module.Python's import system is extensible and can be made to work differently from the usual rules. Of course, this cannot generally be anticipated by library code, and I likewise doubt this is a factor in the design or documentation.
I suspect that the code is correct and the comment incorrect, but since I'm not sure what was originally intended, I am not entirely sure how to fix it. It could be changed to say something like "discern the difference between the initial refresh at import time and subsequent calls," but this is very general and I worry it would paper over whatever really ought to be said. On the other hand, perhaps this really is all that was meant.
I haven't looked into the history of the Python import system but I don't think it operated in a fundamentally different way at the time this explanation was written. This code, and the original accompanying comment, was added in #640 (in 2b3e769). The reasoning given in that pull request, for the actual code change, continues to makes sense, and does not seem to have relied on any incorrect--nor any now-outdated--assumptions about the import system. As far as I can tell, it is only the comment that should be improved.
The text was updated successfully, but these errors were encountered: