File too large in the GIT history #5140

Closed
garrapato opened this issue Sep 1, 2019 · 7 comments

Comments

@garrapato

I have noticed that the repository contains a very large file in its Git history. Is it possible that someone committed a file or directory by mistake? It could be, for example, node_modules or something similar, unless that really is the correct size of the repository.

I just made a clone of the repository and it took a long time to download it.

The file I refer to in a fresh clone of the repository is:

scratch-gui/.git/objects/pack/pack-c81e535f2cf1cd650ef7a6e69553ee444473a465.pack

Expected Behavior

Less time to clone (download) the repo

Actual Behavior

I just made a clone of the repository and it took a very long time to download it.

Steps to Reproduce

$ git clone https://github.com/LLK/scratch-gui.git
Cloning into 'scratch-gui'...
remote: Enumerating objects: 48, done.
remote: Counting objects: 100% (48/48), done.
remote: Compressing objects: 100% (47/47), done.
Receiving objects:  100% (67966/67966), 11.68 MiB | 815.00 KiB

$ cd scratch-gui
$ du -ch . | grep "G\t"
1.0G	./.git/objects/pack
1.1G	./.git/objects
1.1G	./.git
1.1G	.
1.1G	total

$ cd ./.git/objects/pack
$ ll
total 2199344
-r--r--r--  1 garrapato  staff     1904120 Sep  1 06:09 pack-c81e535f2cf1cd650ef7a6e69553ee444473a465.idx
-r--r--r--  1 garrapato  staff  1111074667 Sep  1 06:24 pack-c81e535f2cf1cd650ef7a6e69553ee444473a465.pack

The file already measures more than 1 GB!
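
The same figure can be read from Git itself instead of listing .git/objects/pack by hand. A minimal sketch, assuming a reasonably recent Git; the helper name `repo_pack_kib` is hypothetical and only used for illustration:

```shell
# Hypothetical helper: print the total size of a repository's packfiles.
# `git count-objects -v` reports this as "size-pack", measured in KiB.
repo_pack_kib() {
    git -C "${1:-.}" count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# Usage (assumes a clone in ./scratch-gui):
#   repo_pack_kib scratch-gui
#   git -C scratch-gui count-objects -vH   # same report, human-readable units
```

The number covers every branch that was fetched, not only the one that is checked out.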

Possible solution

If a file or directory was committed by mistake, it must be deleted from the history. The following article shows how to do it:

Removing sensitive data from a repository
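
As a rough illustration of what such a history rewrite involves, here is a sketch using `git filter-branch`, which ships with Git (Git's own documentation now recommends the separate `git filter-repo` tool instead). The helper name `purge_path_from_history` is hypothetical, and the path passed to it is whatever file turns out to be the culprit. Note that rewriting changes the IDs of all affected commits, so every existing clone and fork would need to be re-synced afterwards.

```shell
# Hypothetical helper: drop one path from every commit on every ref.
# Must be run from the top level of the working tree, with no local changes.
# The path must not contain single quotes (it is embedded in a shell command).
purge_path_from_history() {
    path_to_purge=$1
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm -r -q --cached --ignore-unmatch -- '$path_to_purge'" \
        --prune-empty -- --all
}

# Usage:
#   purge_path_from_history path/to/large-file
# The old objects are only freed once the backup refs and reflogs are gone:
#   git for-each-ref --format='%(refname)' refs/original | xargs -n 1 git update-ref -d
#   git reflog expire --expire=now --all
#   git gc --prune=now
```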

Operating System and Browser

Mac OS 10.11.14
Chrome Version 76.0.3809.132 (Official Build) (64-bit)

I hope this information will be useful

Regards

@apple502j
Contributor

Maybe this is because of tutorial animated GIFs. We recommend the use of --depth 10 when cloning this repo.

@Kenny2github
Contributor

In general, this repo's lengthy commit history will make a full clone take an incredible amount of time. I recommend --depth 1 rather than 10 because it really can get to be too much.
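
For reference, a shallow clone along these lines might look like the sketch below. The helper name `shallow_clone` is hypothetical; `--depth`, `--deepen`, and `--unshallow` are standard Git options. Keep in mind that `--depth` also implies `--single-branch` unless `--no-single-branch` is given.

```shell
# Hypothetical helper: clone only the most recent commits of a repository.
# Arguments: URL, destination directory, optional depth (default 1).
shallow_clone() {
    url=$1
    dest=$2
    depth=${3:-1}
    git clone --depth "$depth" "$url" "$dest"
}

# Usage:
#   shallow_clone https://github.com/LLK/scratch-gui.git scratch-gui
# If more history is needed later, it can be fetched afterwards:
#   git -C scratch-gui fetch --deepen=50    # 50 more commits
#   git -C scratch-gui fetch --unshallow    # the complete history
```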

@rschamp
Contributor

rschamp commented Feb 10, 2020

There are indeed a large number of static image files. There isn't anything we can do about the size of the history without causing a lot of conflicts.

@rschamp rschamp closed this as completed Feb 10, 2020
@matschaffer

Do maintainers typically just take the time/space for a full clone? I was under the impression you can't branch/commit/pull on a shallow clone.
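
On the shallow-clone question: to my understanding that restriction applied to older Git releases, and since Git 1.9 a shallow clone can branch, commit, pull, and push. The following throwaway script is a sketch for checking this locally; all repository and branch names in it are made up for the test.

```shell
work=$(mktemp -d)

# Build a small upstream repository with three commits.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Shallow-clone it. A file:// URL is used because --depth is ignored
# when cloning from a plain local path.
git clone -q --depth 1 "file://$work/upstream" "$work/shallow"
cd "$work/shallow"
git config user.name "Example"
git config user.email "example@example.com"

# Branch, commit, and push from the shallow clone.
git checkout -q -b feature
echo 4 > file.txt
git commit -q -am "commit 4"
git push -q origin feature

# Add a commit upstream, then pull it into the shallow clone.
cd "$work/upstream"
echo 5 > file.txt
git commit -q -am "commit 5"
cd "$work/shallow"
git checkout -q "$main_branch"
git pull -q --ff-only
```

What a shallow clone cannot do is show or search history older than its cut-off, so commands such as `git log` and `git blame` stop at the shallow boundary.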

@matschaffer

I just did that. Looks like it's 1.9GB on disk and took about 33min on my connection for a full clone.

@matschaffer

I was curious so I tried this https://stackoverflow.com/a/42544963/69002

It seems like what's taking up a lot of the space is dependencies being committed to the gh-pages branch. Like d33ef36, for example.

Those lib.min.js files seem to be 15 MB to 20 MB each and get committed a few times a day in a few different subdirectories.
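
For anyone who wants to repeat this kind of check, a common way to list the largest blobs in a repository's history is sketched below. This is not necessarily the exact command from the linked answer; the helper name `largest_blobs` is hypothetical. It has to be run inside a full clone, and it reports uncompressed blob sizes (the `%(objectsize:disk)` placeholder could be used instead for on-disk sizes).

```shell
# Hypothetical helper: list the N largest blobs reachable from any ref,
# one per line, as "<size in bytes> <object id> <path>".
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        sed -n 's/^blob //p' |
        sort -k1,1nr |
        head -n "${1:-10}"
}

# Usage (inside the repository):
#   largest_blobs 20
```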

@andreabedini

With the insight that big files are only in the gh-pages branch (which is usually disconnected from main), the situation can be improved using --single-branch when cloning.

~/code/scratch 
❯ git clone https://github.com/LLK/scratch-gui --single-branch 
Cloning into 'scratch-gui'...
remote: Enumerating objects: 44888, done.
remote: Counting objects: 100% (56/56), done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 44888 (delta 38), reused 44 (delta 30), pack-reused 44832
Receiving objects: 100% (44888/44888), 313.38 MiB | 4.37 MiB/s, done.
Resolving deltas: 100% (29577/29577), done.

~/code/scratch [⏱ 1m14s]
❯ du -hs scratch-gui
392M	scratch-gui

~1 minute and ~400MB seem acceptable.
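
If another branch is needed later, a single-branch clone can be extended without re-cloning. A minimal sketch, assuming the remote is named origin; both helper names are hypothetical:

```shell
# Hypothetical helper: clone only one branch (the remote's default branch
# unless a branch name is given as the third argument).
clone_single_branch() {
    url=$1
    dest=$2
    branch=$3
    if [ -n "$branch" ]; then
        git clone --single-branch --branch "$branch" "$url" "$dest"
    else
        git clone --single-branch "$url" "$dest"
    fi
}

# Hypothetical helper: start tracking one more remote branch in an
# existing single-branch clone, then fetch it.
add_remote_branch() {
    repo=$1
    branch=$2
    git -C "$repo" remote set-branches --add origin "$branch" &&
        git -C "$repo" fetch origin "$branch"
}

# Usage:
#   clone_single_branch https://github.com/LLK/scratch-gui scratch-gui
#   add_remote_branch scratch-gui gh-pages   # downloads the large branch too
```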

6 participants