Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix git submodule detection if .git is not a file #687

Merged
merged 5 commits into from
Oct 24, 2023

Conversation

matrss
Copy link
Contributor

@matrss matrss commented Feb 16, 2023

The current heuristic to detect submodules, which checks if a directory contains a .git file, is too simple. In some cases .git might instead be a directory. This happens for example with datalad subdatasets or, according to the git submodules man page, with independently cloned repositories:

A repository that was cloned independently and later added as a submodule or old setups have the submodules git directory inside the submodule instead of embedded into the superprojects git directory.

This PR fixes this issue by getting a list of submodules from the .gitmodules file and comparing the paths with that instead.

@matrss matrss changed the title Fix git submodule detection Fix git submodule detection if .git is not a file Feb 16, 2023
@matrss matrss force-pushed the fix-git-submodules branch 7 times, most recently from 5378741 to 42ad7e4 Compare February 17, 2023 10:08
@matrss
Copy link
Contributor Author

matrss commented Feb 17, 2023

This should be finished now from my side, Windows was a little annoying since I do not have a machine to test locally.

@matrss
Copy link
Contributor Author

matrss commented Feb 22, 2023

One thing to note: this spawns quite a few git processes to parse .gitmodules (one for each directory). I did not notice any slowdown because of that so I did not bother optimizing it away, but we could cache the list of submodules if that is an issue.

@rharish101
Copy link

Will this PR be reviewed/merged soon? I've started using REUSE on all of my repos save for a few where REUSE can't work properly without this feature.

Copy link
Member

@carmenbianca carmenbianca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @matrss! I've written down some remarks. I'll follow up by writing a patch to implement my suggestions.

src/reuse/project.py Outdated Show resolved Hide resolved
src/reuse/project.py Outdated Show resolved Hide resolved
src/reuse/project.py Outdated Show resolved Hide resolved
@carmenbianca carmenbianca force-pushed the fix-git-submodules branch 2 times, most recently from a1051f2 to 7e92f1e Compare April 7, 2023 15:18
Copy link
Member

@nicorikken nicorikken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I reviewed it and resolved the AUTHORS.rst merge conflict by merging the main branch into the PR. If you prefer rebasing, feel free to undo that commit.

I reviewed the code and it makes sense to me. To me this is ready to merge.

The git config ... -get-regexp method is recommended on many places and has strong guarantees on the output as it is executed within the git program.

It leaves the option of adding a check for other VCS's too.

I checked it out locally, tested it in practice, ran the tests, and executed it on the coreboot repo, which I know contains a number of submodules:

coreboot $ git config -z --file .gitmodules --get-regexp "\.path"
submodule.3rdparty/blobs.path
3rdparty/blobs^@submodule.util/nvidia-cbootimage.path
util/nvidia/cbootimage^@submodule.vboot.path
3rdparty/vboot^@submodule.arm-trusted-firmware.path
3rdparty/arm-trusted-firmware^@submodule.3rdparty/chromeec.path
3rdparty/chromeec^@submodule.libhwbase.path
3rdparty/libhwbase^@submodule.libgfxinit.path
3rdparty/libgfxinit^@submodule.3rdparty/fsp.path
3rdparty/fsp^@submodule.opensbi.path
3rdparty/opensbi^@submodule.intel-microcode.path
3rdparty/intel-microcode^@submodule.3rdparty/ffs.path
3rdparty/ffs^@submodule.3rdparty/amd_blobs.path
3rdparty/amd_blobs^@submodule.3rdparty/cmocka.path
3rdparty/cmocka^@submodule.3rdparty/qc_blobs.path
3rdparty/qc_blobs^@submodule.3rdparty/intel-sec-tools.path
3rdparty/intel-sec-tools^@submodule.3rdparty/stm.path
3rdparty/stm^@submodule.util/goswid.path
util/goswid^@

Checking it with debug, I see it skips the submodules:

coreboot $ reuse --debug lint 2>&1 | grep submodule
reuse.project - DEBUG - yielding 'Documentation/getting_started/submodules.md'
reuse.project - INFO - ignoring '3rdparty/libhwbase' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/ffs' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/libgfxinit' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/stm' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/intel-sec-tools' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/blobs' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/qc_blobs' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/cmocka' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/opensbi' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/fsp' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/intel-microcode' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/amd_blobs' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/chromeec' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/vboot' because it is a submodule
reuse.project - INFO - ignoring '3rdparty/arm-trusted-firmware' because it is a submodule
reuse.project - INFO - ignoring 'util/goswid' because it is a submodule
reuse.project - INFO - ignoring 'util/nvidia/cbootimage' because it is a submodule
reuse.project - DEBUG - yielding 'util/scripts/update_submodules'
reuse.project - DEBUG - searching 'Documentation/getting_started/submodules.md' for SPDX information
reuse.project - DEBUG - searching 'util/scripts/update_submodules' for SPDX information
* Documentation/getting_started/submodules.md
* util/scripts/update_submodules

Suggested improvements outside of this PR

@nicorikken
Copy link
Member

Looking into submodules in Mercurial (hg), I found an inception rabbithole of git-hg enabling Mercurial submodules in git, and Mercurial allowing so-called subrepositories of other types like SVN and Git. That might result in some edge-cases to be tested and covered in the future.

@mxmehl mxmehl requested a review from carmenbianca May 4, 2023 15:38
@linozen
Copy link
Member

linozen commented Jun 6, 2023

What's the status here @carmenbianca @nicorikken? Is this ready to merge?

@davidegrohmann
Copy link

Hi, is there an update on this?

@matrss
Copy link
Contributor Author

matrss commented Aug 29, 2023

I thought this was ready to merge at some point, after @carmenbianca addressed her comments herself. Is there anything missing from me?

matrss and others added 5 commits October 24, 2023 13:57
Git submodules do not always contain a .git file. Instead, it might also
be a directory. With this fix we parse .gitmodules to get a list of
paths which are submodules and compare to that, instead of the .git file
heuristic.
Parametrizes the submodule repository fixture to also create
repositories with manually added submodules, that is via git clone and a
manual entry in .gitmodules. The resulting submodule differs to that of
git submodule add in that it's .git path is a directory instead of a
file.
Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
Signed-off-by: Carmen Bianca BAKKER <carmenbianca@fsfe.org>
@carmenbianca
Copy link
Member

Finally merging. Sorry for the delay, and thank you for the PR! @matrss

@carmenbianca carmenbianca merged commit b12fba8 into fsfe:main Oct 24, 2023
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants