Rewriting add_font() and _putfonts() using Fonttools library #477
Codecov Report
@@            Coverage Diff             @@
##           master     #477      +/-   ##
==========================================
+ Coverage   92.34%   93.89%   +1.54%
==========================================
  Files          23       22       -1
  Lines        6860     6093     -767
  Branches     1405     1249     -156
==========================================
- Hits         6335     5721     -614
+ Misses        299      195     -104
+ Partials      226      177      -49
Hi! Sorry for the delay, but I think I finally got something. For doing the subsetting with fonttools I do the following steps:
In doing all of this, because of the difference of the generated subset font, several tests break. I visually inspected every PDF produced with the new code and they seem identical to me. Another thing to mention is that A recap of the current situation:
|
Cool to see this coming to fruition!
Better to find a free example font to add to the tests then.
That was the point after all! 👍
This is apparently not about the changed glyph mapping, so what other data would be subject to change here? Btw.: Codecov complains about reduced coverage in "util.py". I think this is because of the function |
Wow, great job @RedShy in working towards using
That is nice!
I agree with @gmischler that, if it's not too much to ask you @RedShy, a first basic unit test of using an |
This looks really promising, good job!
A few general comments:
- Thank you for commenting the steps you implemented, that will make code maintenance a lot easier!
- You should probably add `fonttools` to `setup.py`
- Some reference PDF file sizes increased by 1/2 KB, I wonder why... That's not a blocker preventing this change, but would you know what could cause this increase in file size? (while other reference PDF files have decreased in size)
Yes, it's very nice and cool! And without the help from both of you I would still be stuck trying to understand what's going on there! 😁
Sure! I just added a test with the old
Sorry @gmischler, I don't get what you mean here
I observed that Given that we are here, also the branch
I'm new to this, I have to just add
Yes, I'm curious about it too, I could try to inspect the qpdf files and see the differences with the previous ones. I think it could be, as @khaledhosny pointed out, that the fonttools subsetter doesn't drop all the tables that were previously dropped. Edit: I have updated the code, keeping only the tables that the old code kept for the tests. Now every PDF file is smaller than before, I think this is due to the optimization that fonttools does:
|
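For readers following along, the idea of keeping only a fixed set of tables can be sketched with the fonttools subsetter roughly as below. This is a minimal sketch under stated assumptions, not the code from this PR: `tables_to_drop`, `subset_font` and the `KEEP_TABLES` list are hypothetical names and values chosen for illustration (the list the old code actually kept is not shown in this thread), and fonttools is imported lazily so that the pure helper can be tried without it.

```python
# Hypothetical set of tables to keep; the real list used by the PR is not shown here.
KEEP_TABLES = {"cmap", "glyf", "head", "hhea", "hmtx", "loca", "maxp", "name", "post"}


def tables_to_drop(present_tags, keep=KEEP_TABLES):
    """Return the sorted, stripped tags of all tables that are not in `keep`."""
    keep = {tag.strip() for tag in keep}
    return sorted(
        tag.strip()
        for tag in present_tags
        # "GlyphOrder" is a pseudo-table listed by TTFont.keys(), not a real table
        if tag != "GlyphOrder" and tag.strip() not in keep
    )


def subset_font(font, text, keep=KEEP_TABLES):
    """Subset a fontTools TTFont in place to the glyphs needed for `text`."""
    from fontTools import subset  # third-party: pip install fonttools

    options = subset.Options()
    options.drop_tables = tables_to_drop(font.keys(), keep)
    subsetter = subset.Subsetter(options=options)
    subsetter.populate(text=text)  # select glyphs by the characters used
    subsetter.subset(font)
    return font


# Usage (assumes a local font file; the path is only an example):
# from fontTools.ttLib import TTFont
# font = subset_font(TTFont("DejaVuSans.ttf"), "Hello")
# font.save("subset.ttf")
```

Passing an explicit drop list replaces the subsetter's default one, which is why the helper derives it from the tables actually present in the font.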
Just for information: I just found in the 1.7 specs:
Which essentially means that when we implement PDF/A-1 (based on PDF 1.4), we'll have to disallow OTF fonts. |
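If OTF fonts ever need to be told apart from TTF ones, one possible check is the 4-byte signature at the start of the font file. The sketch below assumes that "OTF" here means OpenType fonts with CFF outlines; `sfnt_flavor` is a hypothetical helper name and is not part of fpdf2 or fonttools.

```python
def sfnt_flavor(data: bytes) -> str:
    """Classify font data by the 4-byte signature at its start."""
    tag = data[:4]
    if tag == b"OTTO":
        return "OpenType with CFF outlines"
    if tag in (b"\x00\x01\x00\x00", b"true"):
        return "TrueType outlines"
    if tag == b"ttcf":
        return "font collection"
    return "unknown"


# Usage (the path is only an example):
# with open("SomeFont.otf", "rb") as f:
#     print(sfnt_flavor(f.read(4)))
```

Note that the file extension alone is not reliable for this, since a `.otf` file may also contain TrueType outlines.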
So you can show conclusively that all imported fonts end up with That brings me to a bit of a nitpick: I've always wondered about the naming of the
Generally speaking, the absence of tests would rather be a reason to add tests than to remove the code... edit: TTF is always Unicode |
Hi @gmischler thank you for the feedback 😁
Looking at the code at lines L1831, L1881 and L1893 in
Yes, because it goes through
I'm not sure of the exact differences between OpenType and TrueType, currently they are treated the same as
Looking at the code I see
Yes, I absolutely agree. I tried to add some tests back in #439 and we concluded that it was not worth it to improve the testing and clarity of that part of the code. |
(just a side note: thanks a lot to both of you for working on improving this part of the code! I don't have as much insight on the subject as you, but I fully trust the both of you on this. I'll be happy to merge your PR @RedShy whenever @gmischler gives his approval) |
That looks like conclusive proof: All imported fonts are marked as TTF.
Let's get rid of the cruft!
As far as I understand, they share largely the same structure, but some specific types of data may only be present in one or the other. I don't think we need to worry about the distinction right now, though it may possibly become relevant once we try to substitute ligatures etc. (cool follow-up project, btw., in case you're interested 😉).
I don't think we need to mark OTF fonts differently at this point. Feel free to rename those properties, though. |
Great! I deleted the code and renamed those properties
Yes why not? 😁 It's a pleasure to work with both of you and I'm starting to get used to the codebase, the various concepts inherent to fonts, glyphs etc... |
I was hoping for that. You're now the contributor here most familiar with the font handling code and especially the fonttools. I had only looked at these things very superficially myself, to get a general idea of the concepts. More urgently: In the context of #511 you mentioned that you had moved the |
Actually your help was fundamental to understanding what the subsetting was about and how the management of the fonts worked
I'm not sure what you mean by substitutions here
Yes I deleted the |
Supporting ligatures means that a font may offer you the opportunity to write e.g. the two characters "fi", but have the single combined glyph "ﬁ" displayed (note how the upper end of the "f" connects with the dot of the "i", which they otherwise wouldn't). See Google fonts: Ligature for examples related to HTML rendering. This is just one of the simplest cases though, and in western languages it's usually just an aesthetic nicety. In other writing systems, most prominently of the Indic family, it is an actual necessity for being able to write correctly. See our recent issues #365, #459, and #474. In those cases, larger numbers of characters (up to 7 I think) may get combined into one or several glyphs, in an m * n relationship. There are several possible table types in TTF/OTF fonts that can be used to store and retrieve such substitutions, such as "gsub", "liga", "dlig", etc. I hope that fonttools offers some support in searching for those, so that we don't have to figure out all the possible combinations ourselves... |
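To make the substitution idea concrete, here is a minimal sketch of a greedy longest-match replacement over a sequence of glyph names. It only illustrates the simple many-to-one case; the mapping and the glyph names (`f_i`, `f_f_i`) are made up for the example, and `apply_ligatures` is a hypothetical helper. In real OpenType fonts such rules live in lookups of the GSUB table, where `liga` and `dlig` are feature tags that select them, so the mapping would have to be extracted from there.

```python
def apply_ligatures(glyphs, ligatures):
    """Replace glyph sequences with ligature glyphs, longest match first.

    `ligatures` maps tuples of glyph names to a single ligature glyph name.
    """
    max_len = max((len(seq) for seq in ligatures), default=1)
    result = []
    i = 0
    while i < len(glyphs):
        # Try the longest candidate sequence first, down to length 2
        for n in range(min(max_len, len(glyphs) - i), 1, -1):
            seq = tuple(glyphs[i : i + n])
            if seq in ligatures:
                result.append(ligatures[seq])
                i += n
                break
        else:
            # No ligature starts at this position: keep the glyph as is
            result.append(glyphs[i])
            i += 1
    return result


# Illustrative mapping only, not taken from any real font
EXAMPLE_LIGATURES = {("f", "i"): "f_i", ("f", "f", "i"): "f_f_i"}
print(apply_ligatures(list("office"), EXAMPLE_LIGATURES))
```

Trying longer sequences first matters because "ffi" should become one glyph rather than "f" followed by the "fi" ligature.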
@Lucas-C , I don't see any obstacles here anymore, so I'll vote for merging. |
Thank you both for your work on this PR! |
This has been released in v2.5.7 |
This is interesting and sparks me to challenge myself with this project, also it would have a direct impact and be helpful for people out there. |
I want to share what I'm doing, so you can suggest changes in direction, see the progress, jump in and so on.
If I understood correctly, for now the aim is to rewrite `.add_font()` and `._putfonts()` to use Fonttools and drop the homemade `ttfonts.py`.
Now `.add_font()` gets all the data from Fonttools except for the `ttf.fullName`. A cleanup and refactor is still needed.
I'm working on the `.makeSubset()` method and replacing it piece by piece with Fonttools calls. The tables `head`, `hhea`, `maxp` and `cmap` are currently extracted using Fonttools, and I'm now working on the `hmtx` table.
- The GitHub pipeline is OK (green), meaning that both `pylint` (static code analyzer) and `black` (code formatter) are happy with the changes of this PR.
- A unit test is covering the code added / modified by this PR
- This PR is ready to be merged
- In case of a new feature, docstrings have been added, with also some documentation in the `docs/` folder
- A mention of the change is present in `CHANGELOG.md`
By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.
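As an illustration of the description above, reading basic data from the `head`, `hhea`, `maxp`, `cmap` and `hmtx` tables through fonttools could look roughly like this. It is a minimal sketch, not the code of this PR: `collect_metrics` and the keys of the returned dictionary are hypothetical, and the scaling to 1000 units per em is an assumption based on the usual PDF glyph space.

```python
def collect_metrics(font):
    """Gather basic metrics from a fontTools TTFont (or a compatible object)."""
    units_per_em = font["head"].unitsPerEm
    scale = 1000 / units_per_em  # assumption: widths expressed in 1000 units per em
    cmap = font.getBestCmap()  # {code point: glyph name}
    advances = font["hmtx"].metrics  # {glyph name: (advance width, left side bearing)}
    return {
        "units_per_em": units_per_em,
        "ascent": round(font["hhea"].ascent * scale),
        "descent": round(font["hhea"].descent * scale),
        "num_glyphs": font["maxp"].numGlyphs,
        "char_widths": {
            codepoint: round(advances[name][0] * scale)
            for codepoint, name in cmap.items()
        },
    }


# Usage (requires fonttools; the path is only an example):
# from fontTools.ttLib import TTFont
# metrics = collect_metrics(TTFont("DejaVuSans.ttf"))
# print(metrics["char_widths"][ord("A")])
```

The function only relies on the table attributes it reads, so it can be exercised with a small stand-in object in a unit test without a font file.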