Error while listing submodules #279

Closed
lungothrin opened this issue Apr 14, 2015 · 8 comments · Fixed by #818

Comments

@lungothrin

GitPython has a problem listing submodules.

Because '.gitmodules' contains every module that has ever been registered, it becomes problematic when some of them are removed. In that situation 'repo.submodules' fails, although 'repo.iter_submodules()' still works. But when 'repo' is itself a submodule, 'repo.iter_submodules()' does not yield any result.

@Byron Byron added this to the v1.0.1 - Fixes milestone Apr 14, 2015
@Byron
Member

Byron commented Apr 14, 2015

Thanks for letting me know. Could you post code or anything else that would help me reproduce this issue? This will make a fix so much easier.
Also: which version of GitPython are you using? It should be v0.3.6 or above.
Thank you

@lungothrin
Author

GitPython v1.0.0

To reproduce it, add a submodule and commit, then remove the submodule and its git directory in '.git/modules', but keep the entry in '.gitmodules'. Commit again.

When the repository is a normal repository, 'repo.submodules' fails but 'repo.iter_submodules()' still works.
When the repository is a submodule of another repository, both fail.

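from git import Repo

repo = Repo('path')  # path to the affected repository

# Fails when '.gitmodules' still lists a removed submodule
repo.submodules

# Still works in a normal repository, but fails when 'repo' is itself a submodule
for mod in repo.iter_submodules():
    print(mod.path)

Code like this is enough to trigger the problem.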

@Byron
Member

Byron commented Apr 14, 2015

Thank you!
It seems that 'iter_submodules()' is depending on the submodule's repository to exist. Therefore, if .git/modules is removed, this will not have the required information.
iter_submodules() will read information from .gitmodules only, and it doesn't care if the sub-module repositories are actually there.

The statements above are just my preliminary opinion; I don't know the underlying code well enough to say whether this behaviour is good or not.

Can you also state which behaviour you would expect?

@lungothrin
Author

> It seems that 'iter_submodules()' is depending on the submodule's repository to exist. Therefore, if .git/modules is removed, this will not have the required information.

'.gitmodules' may contain entries for submodules that have already been removed. When someone clones the repository, those submodules naturally do not exist in '.git/modules'. This is when the problem occurs.
'iter_submodules()' should rely on the content of '.git/config'. I believe this file is always up to date.

I encountered this problem in this repository: https://github.com/lungothrin/android.git. It is tremendously big.
I will try to create a smaller one and get back to you.
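
The suggestion above (prefer '.git/config' over '.gitmodules') can be illustrated with a small check that lists names present in one file but not the other. This is only a sketch under stated assumptions: the helper names and sample data are hypothetical, the parsing is a simple regex over section headers rather than GitPython's own config reader, and '.git/config' only gains a submodule section after the submodule has been initialised, so an uninitialised but valid submodule would also be reported.

```python
import re

# Matches section headers such as: [submodule "external/foo"]
# (names containing escaped quotes are not handled in this sketch)
_SUBMODULE_HEADER = re.compile(r'^\s*\[submodule\s+"(?P<name>[^"]+)"\]\s*$')


def submodule_names(config_text):
    """Return the set of submodule names declared in git-config-style text."""
    names = set()
    for line in config_text.splitlines():
        match = _SUBMODULE_HEADER.match(line)
        if match:
            names.add(match.group("name"))
    return names


def stale_gitmodules_entries(gitmodules_text, git_config_text):
    """Names listed in .gitmodules that have no section in .git/config."""
    return sorted(submodule_names(gitmodules_text) - submodule_names(git_config_text))


# Hypothetical sample contents, not taken from the repositories in this thread.
gitmodules = """
[submodule "libs/kept"]
    path = libs/kept
    url = https://example.invalid/kept.git
[submodule "libs/removed"]
    path = libs/removed
    url = https://example.invalid/removed.git
"""
git_config = """
[core]
    bare = false
[submodule "libs/kept"]
    url = https://example.invalid/kept.git
"""

stale = stale_gitmodules_entries(gitmodules, git_config)
print(stale)
```

In a real checkout the two strings would be read from the files themselves. When the repository is itself a submodule, '.git' is usually a file pointing at the real git directory, so the config file would have to be located there instead.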

@Byron
Member

Byron commented Apr 14, 2015

Thanks for the clarification. GitPython currently ignores the submodule information in .git/config entirely, as I consider it redundant. From your previous explanation I should be able to reproduce the issue and handle this case better. Another example will help nonetheless, even though it shouldn't strictly be required.

@lungothrin
Author

Clone this project and you will get what you need to reproduce the problem. It is about 350 MiB.
Run it against the submodule 'external/chromium_org'.

@Byron Byron modified the milestones: v1.0.1 - Fixes, v1.0.2 - Fixes Apr 22, 2015
@Byron Byron modified the milestones: v1.0.2 - Fixes, v1.0.3 - Fixes Feb 13, 2016
@Byron Byron modified the milestones: v2.0.0 - Features and Fixes, v2.0.1 - Bugfixes Apr 24, 2016
@nvie nvie modified the milestones: v2.0.4 - Bugfixes, v2.0.5 May 30, 2016
@Byron Byron modified the milestones: v2.0.9 - Bugfixes, v2.0.10 - Bugfixes, v2.1.0 - proper windows support, v2.1.0 - better windows support, v2.1.1 - Bugfixes Oct 16, 2016
@Byron Byron modified the milestones: v2.1.1 - Bugfixes, v2.1.2 - Bugfixes Dec 8, 2016
@Byron Byron modified the milestones: v2.1.2 - Bugfixes, v2.1.3 - Bugfixes Mar 8, 2017
@humitos

humitos commented Jul 17, 2018

Hi! We are experiencing this issue in Read the Docs for some particular repositories (readthedocs/readthedocs.org#4371 (comment)).

I'd like to know whether you plan to add this issue to your roadmap and, if so, when it could be fixed.

The proposed workaround using repo.iter_submodules() doesn't work for us. It fails with the same error when iterating over the generator returned.

The repo linked in the Read the Docs issue is one that can be used to reproduce this problem. Example:

git clone --depth 1 --recursive git@github.com:Chassis/Chassis.git
In [1]: import git

In [2]: repo = git.Repo('/tmp/Chassis')

In [3]: for s in repo.submodules:
   ...:     print(s.path)
   ...:     
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-52cfa97b49d1> in <module>()
----> 1 for s in repo.submodules:
      2     print(s.path)
      3 

~/.pyenv/versions/3.6.6/envs/readthedocs.org/lib/python3.6/site-packages/git/repo/base.py in submodules(self)
    322         :return: git.IterableList(Submodule, ...) of direct submodules
    323             available from the current head"""
--> 324         return Submodule.list_items(self)
    325 
    326     def submodule(self, name):

~/.pyenv/versions/3.6.6/envs/readthedocs.org/lib/python3.6/site-packages/git/util.py in list_items(cls, repo, *args, **kwargs)
    932         :return:list(Item,...) list of item instances"""
    933         out_list = IterableList(cls._id_attribute_)
--> 934         out_list.extend(cls.iter_items(repo, *args, **kwargs))
    935         return out_list
    936 

~/.pyenv/versions/3.6.6/envs/readthedocs.org/lib/python3.6/site-packages/git/objects/submodule/base.py in iter_items(cls, repo, parent_commit)
   1191 
   1192             # fill in remaining info - saves time as it doesn't have to be parsed again
-> 1193             sm._name = n
   1194             if pc != repo.commit():
   1195                 sm._parent_commit = pc

AttributeError: 'Tree' object has no attribute '_name'


Thanks!

@Byron
Member

Byron commented Jul 23, 2018

@humitos Thanks for posting here and adding additional weight! As we are in maintenance mode, all contributions are donated, but highly appreciated. Depending on the impact of the fix or feature, new releases are cut without delay.
Back when submodules were implemented, they looked rather different (in git), and this is partially reflected in code that is more complex than it has to be.
However, the stack trace seems to indicate that this one is more an issue of a broken assumption about the submodule object, which should be fixable with a little digging.
The question I would have when looking at the stack trace is: why does a 'tree' object end up in a place where a 'commit' is assumed? Maybe it's something that git changed, and all it would take is a small adjustment?

That said, would you find yourself capable of providing a fix?
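
One way to investigate the tree-versus-commit question is to look at how the parent commit records the path named in '.gitmodules'. In `git ls-tree` output a submodule (gitlink) entry has mode 160000 and type `commit`, while an ordinary directory has mode 040000 and type `tree`. The sketch below only parses such output lines; the function names and sample lines are hypothetical, and whether a stale '.gitmodules' path recorded as a plain directory is what triggers the error is an assumption, not something confirmed in this thread.

```python
def parse_ls_tree_line(line):
    """Parse one line of `git ls-tree` output: '<mode> <type> <sha>\\t<path>'."""
    meta, _, path = line.partition("\t")
    mode, obj_type, sha = meta.split(" ")
    return {"mode": mode, "type": obj_type, "sha": sha, "path": path}


def is_gitlink(entry):
    """True when the entry is a submodule reference rather than a directory or file."""
    return entry["mode"] == "160000" and entry["type"] == "commit"


# Made-up sample lines in ls-tree format (the object ids are placeholders).
sample_lines = [
    "160000 commit " + "a" * 40 + "\text/real_submodule",
    "040000 tree " + "b" * 40 + "\text/former_submodule",
]

entries = [parse_ls_tree_line(line) for line in sample_lines]
for entry in entries:
    kind = "gitlink" if is_gitlink(entry) else entry["type"]
    print(entry["path"], "->", kind)
```

The real lines could be obtained by running `git ls-tree HEAD <path>` for each path listed in '.gitmodules'. Any path that does not come back as a gitlink is a candidate for the case where a 'tree' appears in place of a 'commit'.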
