-
-
Notifications
You must be signed in to change notification settings - Fork 372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perform a shallow clone of templates on first use #1855
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea but I'm not sure I understand how it's going to work currently.
If we do a shallow clone of the repo, then all Briefcase can know about the repo is the last commit of the default branch. Therefore, it will not be able to checkout the necessary branch for the version of Briefcase....or an arbitrary branch from the CLI.
For instance, trying to use v0.3.18
of the template:
❯ rm -rf build/ ~/.cache/briefcase/templates/briefcase-linux-system-template
❯ briefcase create -vvv --config='template_branch="v0.3.18"'
[helloworld] Finalizing application configuration...
Targeting pop:jammy (Vendor base debian)
Determining glibc version... done
Targeting glibc 2.35
Targeting Python3.10
[helloworld] Generating application template...
Using app template: https://github.com/beeware/briefcase-linux-system-template.git, branch v0.3.18
Cloning template 'https://github.com/beeware/briefcase-linux-system-template.git'...
DEBUG git.cmd: Popen(['git', 'clone', '-v', '--depth=1', '--', 'https://github.com/beeware/briefcase-linux-system-template.git', '/home/russell/.cache/briefcase/templates/briefcase-linux-system-template'], cwd=/home/russell/tmp/beeware/helloworld, stdin=None, shell=False, universal_newlines=True)
DEBUG git.repo.base: Cmd(['git', 'clone', '-v', '--depth=1', '--', 'https://github.com/beeware/briefcase-linux-system-template.git', '/home/russell/.cache/briefcase/templates/briefcase-linux-system-template'])'s unused stdout:
DEBUG git.cmd: Popen(['git', 'remote', 'set-url', '--', 'origin', 'https://github.com/beeware/briefcase-linux-system-template.git', 'https://github.com/beeware/briefcase-linux-system-template.git'], cwd=/home/russell/.cache/briefcase/templates/briefcase-linux-system-template, stdin=None, shell=False, universal_newlines=False)
DEBUG git.cmd: Popen(['git', 'fetch', '-v', '--', 'origin'], cwd=/home/russell/.cache/briefcase/templates/briefcase-linux-system-template, stdin=None, shell=False, universal_newlines=False)
Could not find a template branch for Briefcase v0.3.18.
I don't have a recommended implementation but this GitHub blogpost looks insightful.
A separate comment that may warrant independent effort. While this strategy gets us away from depending on the The only configuration I can find for this is via a file in the file system; for instance:
This file path can be passed to |
Dang... I was afraid I might have missed something in my testing. Back to the drawing board, I guess... |
In a shallow clone, I think that could be done with a But if we're about to remove the binaries from the templates anyway, it might not be worth the effort of getting a shallow clone working. When we remove the binaries, we can force-push the main branch to a commit which either has no history, or a rewritten history with the binaries omitted. After that, as long as we're careful not to add anything large to the templates, a full clone should be no problem. |
We also noted just now that |
Even if we do this, the full repo history will include the blobs for all the historical branches - so the repo will still be ~500MB, even though only 1MB is actually checked out on any recent branches. Another option may be a blobless-clone - this would download all the hashes (so there's full branch history), but only download the blobs when they're needed. I'll investigate further. |
Looks like replacing I've also explicitly disabled replay usage on the cookiecutter calls; that should disable the use of the replay folder, without the need to write a configuration file. |
So, since we're still requiring a non-trivial amount of data to be downloaded and we've taken over the Git clone process from Cookiecutter, do we want to consider implementing a progress bar for GitPython? There's an example here. Before I read your comment fully that there's still 59MB to download....I was surprised when the process seemed to freeze for a minute in my VM. I'll be able to do more robust testing of this strategy tomorrow. |
It's 59MB right now - but once #1849 lands, that should drop dramatically (hopefully to < 4MB, like the Xcode template). That said, I'm not opposed to adding a progress bar. The only hesitation I have about leaning more heavily on GitPython specifically is that the project is currently marked as "maintenance mode only", so there's an outside chance we may need to replace that package at some point in the not-too-distant future. However, if it's low effort to implement, then we might as well lean on it while we can; if we need to fall back to a simple waitbar in the future, then so be it. |
I'll leave this to @rmartin16 as he's more familiar with this area. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few in-line comments/suggestions.
It's 59MB right now - but once #1849 lands, that should drop dramatically (hopefully to < 4MB, like the Xcode template).
All right...although, if the plan is to rewrite the currently large repos' histories, then this seems like it will mostly be a service for large third-party repos since the benefits will be marginal for Briefcase defaults....which lends credence towards Malcolm's comment about "why do this at all?". However, we have already done it at this point and the change doesn't seem that risky to me.
That said, I'm not opposed to adding a progress bar. The only hesitation I have about leaning more heavily on GitPython specifically is that the project is currently marked as "maintenance mode only", so there's an outside chance we may need to replace that package at some point in the not-too-distant future. However, if it's low effort to implement, then we might as well lean on it while we can; if we need to fall back to a simple waitbar in the future, then so be it.
Briefly looking at GitPython's requirements, I don't think it's trivial....but once the histories are rewritten, the immediate need will be removed anyway...so, may as well push this off as well.
Co-authored-by: Russell Martin <russell@rjm.li>
To be clear - the current plan is to not rewrite the repo histories, as the blobless clone means users won't have any large binary artefacts in their repo as soon as #1849 (and the related template changes) land. Without this change, #1849 will mean the final checked out template is tiny... but you will still be cloning 200+MB of repo history. Rewriting repo history is one way to address that without this change, but given that we've got a clean solution that results in only downloading what you need, we might as well lean into that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we haven't spent a lot of time in this code, I spent a bit of time nitpicking....but I'm done digging in after these comments :) I expect to approve once you send it back tonight/in the morning.
Co-authored-by: Russell Martin <russell@rjm.li>
I've cleaned up those comments and error messages; #1849 has also been merged, and with the new "stables" repo, a clean macOS app template is 1.3 MB. Ready for final(?) review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small typo fix and we're good to go 👍🏼
Co-authored-by: Russell Martin <russell@rjm.li>
Cookiecutter's default behavior is to perform a full clone of any repository passed as a URL. This is proving problematic for the macOS and Windows app templates, because the full history contains a lot of old binaries. Unfortunately, cookiecutter doesn't appear to have any setting to perform a shallow clone. However, it can operate on a local file path, and we already have tools to manage branches and remotes in place.
This PR makes 3 changes:
~/.cookiecutters
directory.This is a partial fix for #933; as it reduces the size of the template clone that is required. The full fix is #1849, as that will only download the stub for the Python version that is in use.
Refs #933.
PR Checklist: