Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

git submodules don’t work #61

Closed
hynek opened this issue Dec 9, 2015 · 16 comments
Closed

git submodules don’t work #61

hynek opened this issue Dec 9, 2015 · 16 comments
Labels

Comments

@hynek
Copy link

hynek commented Dec 9, 2015

Currently if I have a git submodule, check-manifest complains about it missing in sdist and VCS at once: https://travis-ci.org/hynek/argon2_cffi/jobs/95785495

@mgedmin
Copy link
Owner

mgedmin commented Dec 9, 2015

This is because check-manifest uses git ls-files to discover what's under version control, and apparently git ls-files doesn't recurse into submodules.

Steps to reproduce:

  • git clone --recursive https://github.com/hynek/argon2_cffi (I tested with commit c8d9535)
  • check-manifest argon2_cffi

@mgedmin mgedmin added the bug label Dec 9, 2015
@mgedmin
Copy link
Owner

mgedmin commented Dec 9, 2015

Plan:

  • run git ls-files as usual
  • figure out what the submodules are by running git submodule --quiet foreach --recursive 'echo $path'
  • for each submodule run git ls-files in that subdirectory
  • append the correct submodule path prefix to all the files retunred in each git ls-files invocation
  • concatenate all the lists
  • profit!

@mgedmin mgedmin closed this as completed in effec6c Dec 9, 2015
@mgedmin
Copy link
Owner

mgedmin commented Dec 9, 2015

With current git master I get the following output if I try with argon2_cffi:

lists of files in version control and sdist do not match!
missing from VCS:
  libargon2/.git

Do you really want your source tarballs to contain a .git file with gitdir: ../.git/modules/libargon2 as the content? If not, you may want to add

exclude src/argon2/_ffi.py .git

to your MANIFEST.in.

@hynek
Copy link
Author

hynek commented Dec 9, 2015

I can confirm that the current master works if I exclude the .git.

Release? 🐶

@mgedmin
Copy link
Owner

mgedmin commented Dec 9, 2015

I will release 0.30 when my Jenkins confirms that the code works on Windows.

This is taking longer than I expected for reasons I don't understand. Previously builds would finish in 10 minutes. This build is taking 53 minutes and counting (and it has test failures I already fixed, which will be tested in a new build).

I rdesktop'ed manually to my Jenkins machine and ran the tests using nose, while Jenkins was still busy on the same machine, and my manual test run took 35 seconds for one Python version. I don't understand what is happening, and I lack my usual tools (pstree, strace) on the unfamiliar Windows platform to debug it.

In the meantime I'm exploring Appveyor.

@mgedmin
Copy link
Owner

mgedmin commented Dec 9, 2015

Somehow the new code produces a duplicate directory name on Windows:
https://ci.appveyor.com/project/mgedmin/check-manifest/build/1.0.4/job/ag3ku5s5qsmq22gq

EDIT: because I used posixpath.join instead of os.path.join, for bad reasons.

@mgedmin
Copy link
Owner

mgedmin commented Dec 9, 2015

Also, any idea why pip install bzr doesn't place a bzr executable on my %PATH%?

EDIT: still dunno why, but conda install bzr and a manual %PATH% update works fine.

(Can't pip install bzr on Python 3.x anyway.)

@mgedmin
Copy link
Owner

mgedmin commented Dec 9, 2015

Down to one error on Appveyor, which is the same error I get when I run the tests manually on the Jenkins machine -- but not when Jenkins runs the tests for me.

@mgedmin
Copy link
Owner

mgedmin commented Dec 9, 2015

At this point I have tests passing when I run .tox/py27/scripts/nosetests, and failing when I run tox -e py27. But there are no locale environment variables on Windows! ... are there?

EDIT: sys.stdout.encoding is None when tox redirects the output of nosetests into a log file. Duh.

Also, I seriously question my life choices.

@hynek
Copy link
Author

hynek commented Dec 9, 2015

welcome to my life. :)

if it makes you feel any better, I fought (and lost) Windows too today

@mgedmin
Copy link
Owner

mgedmin commented Dec 10, 2015

Yay: I got the tests to pass on Appveyor and manually.

Boo: the tests now fail on Jenkins.

Do I even care?

@mgedmin
Copy link
Owner

mgedmin commented Dec 10, 2015

check-manifest 0.30 is out on PyPI.

hynek added a commit to hynek/argon2-cffi that referenced this issue Dec 10, 2015
...because @mgedmin makes bad life's decisions and is awesome.

ref mgedmin/check-manifest#61
@lorengordon
Copy link
Contributor

lorengordon commented Aug 26, 2016

@mgedmin, I'm still seeing duplicate directory names on Windows, using check-manifest 0.32 (and building from source). I added this to get past it temporarily, but obviously not fixing the root cause. Any ideas?

# git diff
diff --git a/check_manifest.py b/check_manifest.py
index 948cd7c..33842ac 100755
--- a/check_manifest.py
+++ b/check_manifest.py
@@ -194,7 +194,10 @@ def copy_files(filelist, destdir):
         if not os.path.isdir(destfiledir):
             os.makedirs(destfiledir)
         if os.path.isdir(filename):
-            os.mkdir(destfile)
+            try:
+                os.mkdir(destfile)
+            except WindowsError:
+                pass
         else:
             shutil.copy2(filename, destfile)

@lorengordon
Copy link
Contributor

This diff starts to expose the underlying problem. The same directory is sometimes returned twice by vcs.get_versioned_files(), once with forward slashes and once with backslashes.

diff --git a/check_manifest.py b/check_manifest.py
index 948cd7c..90dd9b5 100755
--- a/check_manifest.py
+++ b/check_manifest.py
@@ -30,6 +30,7 @@ import zipfile
 from contextlib import contextmanager, closing
 from distutils.text_file import TextFile
 from xml.etree import cElementTree as ET
+from pprint import pprint

 try:
     import ConfigParser
@@ -194,7 +195,10 @@ def copy_files(filelist, destdir):
         if not os.path.isdir(destfiledir):
             os.makedirs(destfiledir)
         if os.path.isdir(filename):
-            os.mkdir(destfile)
+            try:
+                os.mkdir(destfile)
+            except WindowsError:
+                pass
         else:
             shutil.copy2(filename, destfile)

@@ -433,6 +437,10 @@ def detect_vcs():
 def get_vcs_files():
     """List all files under version control in the current directory."""
     vcs = detect_vcs()
+    print("\n\nPrinting sorted get_versioned_files\n\n")
+    pprint(sorted(vcs.get_versioned_files()))
+    print("\n\nPrinting sorted normalize_names\n\n")
+    pprint(sorted(normalize_names(vcs.get_versioned_files())))
     return normalize_names(vcs.get_versioned_files())

@lorengordon
Copy link
Contributor

lorengordon commented Aug 26, 2016

Ok, think I got it. _git_ls_files() returns file paths with forward slashes. For files in submodules, add_prefix_to_each() is called, which uses os.path.join() and the relpath os.path.relpath() and ends up creating a file path with backslashes. Since not every file is in a submodule, the list of files ends up with some files that use forward slashes and others that use backslashes. This is probably only a problem for us because we're packaging submodules in a static directory within our src/project directory, so both our src and the submodules share a bit of the path. E.g.

<project_root>/src/project/__init__.py
<project_root>/src/project/<modules>/
<project_root>/src/project/static/<submodules>

If it were more like this, there wouldn't be any path overlap and we wouldn't have seen the issue:

<project_root>/src/project/__init__.py
<project_root>/src/project/<modules>/
<project_root>/static/<submodules>

lorengordon added a commit to lorengordon/check-manifest that referenced this issue Aug 26, 2016
`_git_ls_files()` returns file paths with forward slashes. For files
in submodules, add_prefix_to_each() is called to prefix all the
submodule files with the relative path to the project root. On Windows,
the relative path contains backslashes. Since not every file in the
project is in a submodule, the combined list of files ended up with
some files that use forward slashes and others that use backslashes.

For project structures where the submodules share a path segment with
the project source, this caused `add_directories()` to create duplicate
directory entries in the list of files. The duplicate entries would
result in a traceback with a WindowsError when `os.mkdir()` was called
to create the same directory a second time.

See mgedmin#61 for more details.

This commit replaces the backslashes with forward slashes in the
relative path, eliminating the duplicate directory entries.
lorengordon added a commit to lorengordon/check-manifest that referenced this issue Aug 26, 2016
`_git_ls_files()` returns file paths with forward slashes. For files
in submodules, add_prefix_to_each() is called to prefix all the
submodule files with the relative path to the project root. On Windows,
the relative path contains backslashes. Since not every file in the
project is in a submodule, the combined list of files ended up with
some files that use forward slashes and others that use backslashes.

For project structures where the submodules share a path segment with
the project source, this caused `add_directories()` to create duplicate
directory entries in the list of files. The duplicate entries would
result in a traceback with a WindowsError when `os.mkdir()` was called
to create the same directory a second time.

See mgedmin#61 for more details.

This commit replaces the backslashes with forward slashes in the
relative path, eliminating the duplicate directory entries.
lorengordon added a commit to lorengordon/check-manifest that referenced this issue Aug 26, 2016
`_git_ls_files()` returns file paths with forward slashes. For files
in submodules, add_prefix_to_each() is called to prefix all the
submodule files with the relative path to the project root. On Windows,
the relative path contains backslashes. Since not every file in the
project is in a submodule, the combined list of files ended up with
some files that use forward slashes and others that use backslashes.

For project structures where the submodules share a path segment with
the project source, this caused `add_directories()` to create duplicate
directory entries in the list of files. The duplicate entries would
result in a traceback with a WindowsError when `os.mkdir()` was called
to create the same directory a second time.

See mgedmin#61 for more details.

This commit replaces the backslashes with forward slashes in the
relative path, eliminating the duplicate directory entries.
lorengordon added a commit to lorengordon/check-manifest that referenced this issue Aug 26, 2016
`_git_ls_files()` returns file paths with forward slashes. For files
in submodules, add_prefix_to_each() is called to prefix all the
submodule files with the relative path to the project root. On Windows,
the relative path contains backslashes. Since not every file in the
project is in a submodule, the combined list of files ended up with
some files that use forward slashes and others that use backslashes.

For project structures where the submodules share a path segment with
the project source, this caused `add_directories()` to create duplicate
directory entries in the list of files. The duplicate entries would
result in a traceback with a WindowsError when `os.mkdir()` was called
to create the same directory a second time.

See mgedmin#61 for more details.

This commit replaces the backslashes with forward slashes in the
relative path, eliminating the duplicate directory entries.
lorengordon added a commit to lorengordon/check-manifest that referenced this issue Aug 26, 2016
`_git_ls_files()` returns file paths with forward slashes. For files
in submodules, add_prefix_to_each() is called to prefix all the
submodule files with the relative path to the project root. On Windows,
the relative path contains backslashes. Since not every file in the
project is in a submodule, the combined list of files ended up with
some files that use forward slashes and others that use backslashes.

For project structures where the submodules share a path segment with
the project source, this caused `add_directories()` to create duplicate
directory entries in the list of files. The duplicate entries would
result in a traceback with a WindowsError when `os.mkdir()` was called
to create the same directory a second time.

See mgedmin#61 for more details.

This commit ensures the list of files contains only forward slashes
before calling `add_directories()`, eliminating the duplicate directory
entries.
@mgedmin
Copy link
Owner

mgedmin commented Aug 29, 2016

@lorengordon Thank you for the investigation! (Also, next time can you please open a new issue instead of commenting on an old closed one?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants