Supports Dynamic Language Loading for Smaller Download Size #488

Open
Chanakan5591 opened this issue Jul 27, 2023 · 4 comments

Chanakan5591 commented Jul 27, 2023

Coming from here: #460 (comment)

I think it would also be a good idea to support dynamic font loading for languages like this. For example, Thai is not included in the CJK font set, so bundling its fonts would increase the image size by a lot. Currently, any Thai text in a document file is rendered as tofu.

deeplow commented Jul 27, 2023

Thanks for reporting this. Assessing the current language support state and implementing a solution like dynamic loading is something that we plan to do.

Regarding Thai specifically, do you know what would be needed to add support for it? Never mind, I see now that you explained this in our other discussion. Pasting it here for future reference:

the Noto Thai font here: https://github.com/notofonts/thai.

@deeplow deeplow changed the title Supports Dynamic Font Loading Supports Dynamic Language Loading Aug 14, 2023

deeplow commented Aug 14, 2023

Increased the scope of this issue to consider the wider problem of language support. For a language to be fully supported, we need both the fonts in the "doc to pixels" part (for proper rendering) and the OCR models in the "pixels to PDF" part.

The goal will be to find a scalable way to achieve this. OCR models inflate the container image size by a lot. If we could trim that down and only download them on an as-needed basis, that would be perfect.
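
As a rough illustration of the "per-need" idea, here is a minimal Python sketch that works out which OCR models are missing locally and fetches only those. Everything in it is an assumption made for illustration: the `<lang>.traineddata` file naming (borrowed from Tesseract's convention), the `missing_models` and `ensure_models` helpers, and the `fetch` callback, which stands in for whatever download mechanism would actually be used. It is not how Dangerzone currently works.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def missing_models(languages: Iterable[str], model_dir: Path) -> List[str]:
    """Return the language codes whose model file is not in model_dir.

    Assumes one file per language named "<lang>.traineddata"
    (hypothetical layout, following Tesseract's naming convention).
    Duplicate codes are collapsed; input order is preserved.
    """
    return [
        lang
        for lang in dict.fromkeys(languages)
        if not (model_dir / f"{lang}.traineddata").is_file()
    ]


def ensure_models(
    languages: Iterable[str],
    model_dir: Path,
    fetch: Callable[[str, Path], None],
) -> List[str]:
    """Fetch only the missing models and return the codes that were fetched.

    `fetch(lang, destination)` is a hypothetical callback that downloads
    the model for `lang` and writes it to `destination`.
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for lang in missing_models(languages, model_dir):
        fetch(lang, model_dir / f"{lang}.traineddata")
        fetched.append(lang)
    return fetched
```

Passing the download step in as a callback keeps the "what is missing" logic testable without network access, and an offline-ready build could call the same function at build time with the full language list.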

Offline-mode

The other day, the idea of an offline-ready version of Dangerzone surfaced, in which all of these models would already be pre-downloaded.

@deeplow deeplow changed the title Supports Dynamic Language Loading Supports Dynamic Language Loading for Smaller Download Size Aug 14, 2023

deeplow commented Aug 14, 2023

For comparison, here's a breakdown of the application sizes:

| Language content | container.tar.gz size | Dangerzone.app size |
| --- | --- | --- |
| current (0.4.2) | 900M | 963.6M |
| only eng OCR model | 452M | 578.5M |
| only eng OCR model and no CJK fonts | 381M | 503.4M |
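
To put the breakdown in relative terms, the short Python sketch below computes the percentage saved against the current (0.4.2) sizes. The figures are copied from the breakdown above; the `percent_saved` helper and the dictionary layout are made up for illustration.

```python
# Sizes from the breakdown above: (container.tar.gz, Dangerzone.app), in "M".
SIZES = {
    "current (0.4.2)": (900.0, 963.6),
    "only eng OCR model": (452.0, 578.5),
    "only eng OCR model and no CJK fonts": (381.0, 503.4),
}


def percent_saved(baseline: float, trimmed: float) -> float:
    """Return the size reduction relative to baseline, in percent (1 decimal)."""
    return round(100.0 * (baseline - trimmed) / baseline, 1)


if __name__ == "__main__":
    base_container, base_app = SIZES["current (0.4.2)"]
    for name, (container, app) in SIZES.items():
        print(
            f"{name}: container -{percent_saved(base_container, container)}%, "
            f"app -{percent_saved(base_app, app)}%"
        )
```

By this calculation, shipping only the English OCR model roughly halves the container archive, and dropping the CJK fonts as well removes a further few percentage points.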

deeplow commented Nov 11, 2024

Some things to keep in mind about this on Qubes:

  • it won't work when the client runs in an offline qube
  • the default size of the user's /home is 5GiB. As with the 0.8.0 release, the install instructions may need to be updated to increase the size of the client qube by 5GiB or more.
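
For the disk-space concern, a client could check the available space before attempting any download. The following is a minimal Python sketch using the standard library's `shutil.disk_usage`. The `has_room` helper, the choice of the home directory as the target, and the 2 GiB example figure are all assumptions for illustration, not values taken from Dangerzone.

```python
import shutil
from pathlib import Path

GIB = 1024 ** 3  # bytes in one GiB


def has_room(path, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # Hypothetical requirement: 2 GiB of models stored under the user's home.
    required = 2 * GIB
    if has_room(Path.home(), required):
        print("Enough free space for the download.")
    else:
        print("Not enough free space; the qube's private storage may need resizing.")
```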
