Letter GlyphRectangle with Height == 0 #287

Closed
simonedd opened this issue Feb 10, 2021 · 5 comments

@simonedd

Hi,

In this document, every letter has a GlyphRectangle with a Height equal to zero.
A1-RU-101-S.pdf

@BobLd
Collaborator

BobLd commented Feb 10, 2021

Hi @simonedd, have you tried compiling and testing this document with the current master? Some fixes for the bbox height were pushed recently, but your document might be another case of this issue.

@simonedd
Author

Yes, I tested with the latest version.

EliotJones added the bug label Feb 11, 2021
@BobLd
Collaborator

BobLd commented Feb 13, 2021

The issue comes from the Type0CidFont class (in UglyToad.PdfPig.PdfFonts.CidFonts).

I've added the field private readonly double scale; at the top of the class, which is set in the constructor:
scale = 1 / (double)(fontProgram?.GetFontMatrixMultiplier() ?? 1000);

Then, the GetBoundingBox(int characterIdentifier) function currently looks like this:

        public PdfRectangle GetBoundingBox(int characterIdentifier)
        {
            // TODO: correct values
            if (characterIdentifier < 0)
            {
                throw new ArgumentException($"The provided character identifier was negative: {characterIdentifier}.");
            }

            if (fontProgram == null)
            {
                return Descriptor?.BoundingBox ?? new PdfRectangle(0, 0, 1000, 0);
            }

            if (fontProgram.TryGetBoundingBox(characterIdentifier, out var boundingBox))
            {
                return boundingBox;
            }

            if (Widths.TryGetValue(characterIdentifier, out var width))
            {
                return new PdfRectangle(0, 0, width, 0);
            }

            if (defaultWidth.HasValue)
            {
                return new PdfRectangle(0, 0, defaultWidth.Value, 0);
            }

            return new PdfRectangle(0, 0, 1000, 0);
        }

I've amended the code as follows, replacing the zero height in each fallback rectangle with 1.0 / scale:

        public PdfRectangle GetBoundingBox(int characterIdentifier)
        {
            // TODO: correct values
            if (characterIdentifier < 0)
            {
                throw new ArgumentException($"The provided character identifier was negative: {characterIdentifier}.");
            }

            if (fontProgram == null)
            {
                return Descriptor?.BoundingBox ?? new PdfRectangle(0, 0, 1000, 1.0 / scale);
            }

            if (fontProgram.TryGetBoundingBox(characterIdentifier, out var boundingBox))
            {
                return boundingBox;
            }

            if (Widths.TryGetValue(characterIdentifier, out var width))
            {
                return new PdfRectangle(0, 0, width, 1.0 / scale);
            }

            if (defaultWidth.HasValue)
            {
                return new PdfRectangle(0, 0, defaultWidth.Value, 1.0 / scale);
            }

            return new PdfRectangle(0, 0, 1000, 1.0 / scale);
        }

This is the result, which seems to be in line with what Acrobat Reader displays (letters in blue, words in red):
image

I am really not sure this would work for every document. I'll try to push a PR once I have a better understanding of the problem.
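
For readers who want to try the idea outside PdfPig, here is a minimal, self-contained sketch of the fallback order described above. It is not the library's code: Rect, FallbackBoundingBox, GetScale and Get are hypothetical names invented for illustration, the font program lookup is left out, and the font matrix multiplier is assumed to be an optional integer. It only shows that, with the default multiplier of 1000, 1.0 / scale yields a non-zero height for every fallback rectangle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for PdfRectangle: just two corner points.
public readonly struct Rect
{
    public Rect(double x1, double y1, double x2, double y2)
    {
        Left = x1; Bottom = y1; Right = x2; Top = y2;
    }

    public double Left { get; }
    public double Bottom { get; }
    public double Right { get; }
    public double Top { get; }

    public double Width => Right - Left;
    public double Height => Top - Bottom;
}

public static class FallbackBoundingBox
{
    // Same arithmetic as the scale field in the comment above:
    // 1 / multiplier, falling back to 1000 when no multiplier is known.
    public static double GetScale(int? fontMatrixMultiplier)
    {
        return 1 / (double)(fontMatrixMultiplier ?? 1000);
    }

    // Fallback order from the amended method, without the font program step:
    // explicit width, then default width, then a 1000-unit wide box.
    public static Rect Get(
        int characterIdentifier,
        IReadOnlyDictionary<int, double> widths,
        double? defaultWidth,
        double scale)
    {
        if (characterIdentifier < 0)
        {
            throw new ArgumentException(
                $"The provided character identifier was negative: {characterIdentifier}.");
        }

        // Assumption taken from the amended code: use 1 / scale as the height
        // instead of 0, so the glyph rectangle is never degenerate.
        double height = 1.0 / scale;

        if (widths.TryGetValue(characterIdentifier, out var width))
        {
            return new Rect(0, 0, width, height);
        }

        if (defaultWidth.HasValue)
        {
            return new Rect(0, 0, defaultWidth.Value, height);
        }

        return new Rect(0, 0, 1000, height);
    }
}
```

Calling FallbackBoundingBox.Get(5, widths, null, FallbackBoundingBox.GetScale(null)) with a widths dictionary that maps 5 to 500 should give a rectangle 500 units wide with a height of roughly 1000 units in glyph space, rather than 0. Whether 1 / scale is the right height for every font is exactly the open question raised above.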

@BobLd
Collaborator

BobLd commented Aug 17, 2021

Closed as fixed in 0.1.5.

BobLd closed this as completed Aug 17, 2021
@mayurjansari

A Unicode character has a height of 0.
