Source files too large to upload to github #554

Closed
tay1orjones opened this issue Jan 23, 2024 · 13 comments · Fixed by #555 or #560

Comments

@tay1orjones
Member

While trying to cut the release today, I ran into the following:

image

This is similar to #453 but different - we're now hitting the filesize limit on this repo with the actual source files. Not related to yarn cache or anything else, this is just plainly too large.

There are two options to resolve this:

  1. Turn on and use git-lfs
  2. No longer include source files in the repository, instead include them as release artifacts

I think the second option makes the most sense. It will be some additional manual steps for now during the release process but in the future we could automate it.
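For anyone who wants to check which files would trip the limit before pushing, here is a minimal Python sketch. It assumes the limit being hit is GitHub's 100 MiB per-file cap (the screenshot is not reproduced here), and `find_oversized_files` is a hypothetical helper name, not part of the Plex tooling.

```python
import os
from pathlib import Path

# Assumption: the limit in question is GitHub's 100 MiB per-file cap.
GITHUB_FILE_LIMIT_BYTES = 100 * 1024 * 1024


def find_oversized_files(root, limit=GITHUB_FILE_LIMIT_BYTES):
    """Return (path, size) pairs for files under `root` larger than `limit`, largest first."""
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own object store.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                # Skip symlinks so a dangling link cannot raise on stat().
                continue
            size = path.stat().st_size
            if size > limit:
                oversized.append((path, size))
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Calling `find_oversized_files(".")` from the repository root would list the offending source files; passing a smaller `limit` helps spot files that are getting close.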

@BoldMonday
Collaborator

There is a third option I think: to convert the .glyphs files to .glyphspackage. AFAIK this will split the file into lots of smaller ones, similar to .ufo format.

If that works (I need to check) that would be my preferred solution.

@BoldMonday
Collaborator

@tay1orjones Just checked and my assumption about .glyphspackage files is correct.
Let me create a new package with all files and I’ll let you know when it’s done.

@BoldMonday
Collaborator

@tay1orjones Done! I’ve replaced the older package.

@tay1orjones
Member Author

Thanks @BoldMonday - I've actually run into an even larger problem, so I'm reopening this issue. The overall package size is now too large to publish to npm:

image

The tentative plan is to get the existing updates that have been queueing since May into @ibm/plex, and we'll release Traditional Chinese in its own package at @ibm/plex-traditional-chinese.

From there, the new maintainers will need to work on converting the rest of the repository to publish one package per family instead of one big @ibm/plex (see #453).

@NightFurySL2001

Maybe it'll be a good idea to split JP/KR/TC/SC into a CJK repo similar to Source Han/Noto Sans CJK?

@tay1orjones
Member Author

Yeah, the plan is to keep it all in one repo, but the families will be split up into individual packages on npm. Full details are in #452; the consensus is to go with "Option 1" outlined there.

@tay1orjones changed the title from "Source files too large" to "Source files too large to upload to github" on Jan 26, 2024

@tay1orjones
Member Author

I've got a PR up addressing the first piece of this - removing the source files from the repo. I opened a new issue for the npm publish blocker for Plex TC: #561

@tay1orjones
Member Author

#452 (comment)

@BoldMonday
Collaborator

@tay1orjones I think it's a pity that source files are not stored in the repo anymore.
Is converting all .glyphs files to .glyphspackage not a viable solution?

@tay1orjones
Member Author

@BoldMonday I agree, but unfortunately it's unavoidable as Plex grows. Yes, the .glyphspackage fix technically unblocks the error I screenshotted at the opening of this issue, although the overall size of the repo is still a big concern. The repo size today is ~500 MB. Adding the TC folder alone will add another 1.1 GB.

GitHub recommends repo size stay below 1GB.

The source files will still be included in the repo under the releases tab as artifacts on each release moving forward. So they'll be there in the release history, just not fully version controlled anymore. The source folder has never been published to npm either, so the impact of this change should be pretty small.

@NightFurySL2001

NightFurySL2001 commented Jan 26, 2024

Any chance the source files for Chinese/CJK can still be tracked in a separate branch per language? Maybe as source-tc, source-jp, source-kr (and the upcoming source-sc)? This would still keep the repo roughly under 1GB (per branch, that is), but at least it's still under version control. This could also be done for the webfonts.

Source Han has two branches under separate version control, essentially keeping two separate timelines (source files and release fonts) in one repo. The only downsides are that releases and tags will need to be made twice, and that the git history of the whole repo will keep increasing in size (the Source Han repo git history has reached 10GB after ~10 years and 8 major version releases).

@tay1orjones
Member Author

@NightFurySL2001 My understanding is that a repository includes all branches stored on the remote, as well as the full repo history. Having them on separate branches wouldn't reduce the overall repo size to my knowledge.

What's the benefit to keeping the sources version controlled? They're currently not updated any more frequently than when a release is published, and the files can't be diffed within github. Is there a tool that supports diffing two versions stored in a git repository? I'm not sure what the added benefit or use case might be for storing them in version control.

@NightFurySL2001

NightFurySL2001 commented Jan 26, 2024

At least for .glyphs, .glyphspackage and .ufo files, cloning the repo and running a standard git diff (git diff ver1 ver2) shows which glyphs, features or metrics changed between versions, since these 3 formats are text based. .glyphspackage and .ufo are slightly easier to work with because they store each glyph in a separate file, so the files that differ between versions (listed with git diff --name-only) are the glyphs/metrics that were modified. .glyphs, however, keeps the full list of glyphs in one single file, so you have to check the changed lines individually to see which glyphs they belong to.
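As a rough illustration of the per-file approach, here is a Python sketch that groups the output of `git diff --name-only ver1 ver2` by source package. It assumes per-glyph files end in `.glyph` (inside a .glyphspackage) or `.glif` (inside a .ufo); `changed_glyph_files` is a hypothetical helper, and the file stems it returns only approximate glyph names, since both formats may escape some characters in file names.

```python
from pathlib import PurePosixPath

# Assumption: per-glyph files use these extensions
# (.glyph inside a .glyphspackage, .glif inside a .ufo).
GLYPH_FILE_SUFFIXES = {".glyph", ".glif"}
PACKAGE_SUFFIXES = {".glyphspackage", ".ufo"}


def changed_glyph_files(name_only_output):
    """Group changed per-glyph file stems by the package directory containing them.

    `name_only_output` is the text printed by `git diff --name-only ver1 ver2`.
    """
    changed = {}
    for line in name_only_output.splitlines():
        path = PurePosixPath(line.strip())
        if path.suffix not in GLYPH_FILE_SUFFIXES:
            # Ignore fontinfo, feature files, READMEs and so on.
            continue
        # Walk up to the enclosing .glyphspackage or .ufo directory, if any.
        package = next(
            (p for p in path.parents if p.suffix in PACKAGE_SUFFIXES),
            None,
        )
        if package is not None:
            changed.setdefault(str(package), []).append(path.stem)
    return changed
```

The input text could come from `subprocess.run(["git", "diff", "--name-only", ver1, ver2], capture_output=True, text=True, check=True).stdout`, run inside a clone of the repo.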

It's similar to the Plex release log of what has changed, but since CJK has plenty of glyphs it'll be more efficient to just use git diff. (See #556 for a list of errors and suggestions for TC, which is pretty long.) Also, releases are a GitHub-specific feature and are not included in the git repository, so a standard git clone can't get the source files attached to releases; that requires a separate cURL call to the GitHub server. I digress though, since CJK font files are inherently big and plenty of projects do provide alternate means of getting the font source files instead of tracking them directly in git.
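For reference, that separate call could look roughly like the Python sketch below, which uses GitHub's REST endpoint for a repository's latest release. The `owner` and `repo` arguments are placeholders to fill in, the helper names are hypothetical, and how the source archives are named among the release assets is not specified in this thread.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def latest_release_url(owner, repo):
    """Build the REST API URL for a repository's latest release."""
    return f"{GITHUB_API}/repos/{owner}/{repo}/releases/latest"


def asset_download_urls(release):
    """Map asset file names to download URLs from a release JSON object."""
    return {
        asset["name"]: asset["browser_download_url"]
        for asset in release.get("assets", [])
    }


def fetch_latest_release(owner, repo):
    """Fetch the latest release metadata (performs a network request)."""
    request = urllib.request.Request(
        latest_release_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Something like `asset_download_urls(fetch_latest_release(owner, repo))` would then give the URLs to download, which is the extra step a plain git clone does not cover.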
