New Feature: Support ligature Glyphs from TTF Fonts #540
Comments
In #549, @marcstober pointed out that some combined characters are not technically ligatures (looked up in "gsub"), but rather diacritics. Those don't need to be substituted, but they follow special placement rules found in "gpos". This is particularly important (and tricky) when several of them need to be combined with a single base character, which may require them to be stacked on top of each other. Example scripts that require this are Hebrew and Thai. I don't think it is realistic to handle all those special cases directly in As a basis for such functionality, an initial refactoring might introduce a generic
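To make the stacking problem concrete, here is a minimal sketch of the step that would come before any "gpos" lookup: grouping each base character with the marks attached to it. The function name `group_marks` is hypothetical and not part of fpdf2; the sketch only uses the Unicode general category from the standard library and does not read any font tables, so it says nothing about where the marks should actually be placed.

```python
import unicodedata


def group_marks(text):
    """Group each base character with the combining marks that follow it.

    Returns a list of (base, [marks]) tuples. A mark with no preceding
    base (e.g. at the start of the string) is treated as its own base.
    """
    clusters = []
    for ch in text:
        # General categories Mn, Mc and Me all start with "M" (marks).
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and clusters:
            clusters[-1][1].append(ch)
        else:
            clusters.append((ch, []))
    return clusters


# Hebrew shin with two points, followed by lamed.
hebrew_clusters = group_marks("\u05e9\u05b8\u05c1\u05dc")
# Thai ko kai with a vowel sign and a tone mark stacked above it.
thai_clusters = group_marks("\u0e01\u0e34\u0e48")
```

Any cluster with more than one mark is a case where the marks may need to be stacked, which is where the font's placement rules would have to be consulted.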
@andersonhc PR #820 has been merged today. Could you test if that solved your issue @gmischler? You can install
The documentation is there: https://pyfpdf.github.io/fpdf2/TextShaping.html
As far as I can determine, this is now fixed.
We've had several issues raised because fpdf2 currently doesn't correctly support writing systems that require the merging of successive characters into combined glyphs (ligatures). This appears to be mainly the case for Indic scripts.
Compare #365, #381, #459, #474, and downstream global_scorecards #7.
Those ligature glyphs are usually stored in the font file with index numbers outside of the range of Unicode characters that can be represented with Python strings. This means that we need a custom data structure to represent them, which can also include some other helpful information. Note that such a ligature may actually consist of several partial glyphs, so there is an n*m relationship between Unicode code points and ligature glyphs.
We could represent our text elements e.g. similar to this:
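As a minimal sketch of such a structure (not the snippet from the original post): `GlyphRun`, `substitute`, and the toy tables below are hypothetical names and made-up data, and nothing here reads a real "gsub" table. It only illustrates how n code points can map to m glyph indices while keeping the original text around.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlyphRun:
    """n Unicode code points rendered by m font glyphs (hypothetical)."""
    text: str          # the original code points, kept for copy/paste or search
    glyph_ids: tuple   # font-internal glyph indices, not code points


# Made-up data standing in for a font's character map and ligature rules.
TOY_CMAP = {"f": 10, "i": 11, "x": 12}
TOY_LIGATURES = {"fi": (500,), "ffi": (501, 502)}  # "ffi" uses two partial glyphs


def substitute(text, ligatures, cmap):
    """Greedy longest-match substitution over a toy ligature table."""
    runs = []
    max_len = max((len(key) for key in ligatures), default=1)
    i = 0
    while i < len(text):
        # Try the longest candidate first, down to two code points.
        for length in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + length]
            if chunk in ligatures:
                runs.append(GlyphRun(chunk, ligatures[chunk]))
                i += length
                break
        else:
            # No ligature matched: fall back to the single-character glyph.
            runs.append(GlyphRun(text[i], (cmap[text[i]],)))
            i += 1
    return runs


runs = substitute("xffi", TOY_LIGATURES, TOY_CMAP)
```

Keeping the source text next to the glyph indices is a deliberate choice in this sketch: the glyph indices alone cannot be mapped back to characters once several code points have been merged.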
The processing sequence might look something like this:
There are probably quite a few pitfalls that aren't obvious at the moment. We'll also need support and advice from native speakers of the respective languages, who are the only ones able to spot any errors in the resulting files. There may be tables other than "gsub" in some fonts that we might also want to take into account.
Upside of the change:
Downside:
Anyone up for the task?