PDF to Image conversion skips signature image created with Adobe Reader #191

Closed
Tlaloc-Es opened this issue Mar 14, 2023 · 11 comments
Labels
bug Something isn't working needsinfo Additional information from the reporter is required

Comments

@Tlaloc-Es

Hello,

I'm trying to convert a PDF document into an image using the render method in version 4.2.0 of the library. However, the signature image created with Adobe Reader is not being rendered in the resulting image.

Here's the code snippet I'm using:

pdf_bitmap = page.render(
    scale=3,
    rotation=0,
    crop=(0,0,0,0),
    grayscale=False,
    optimize_mode=None,
    draw_annots=True
)

Is there a way to include the signature image in the resulting image? Any help or suggestion would be greatly appreciated. Thank you.

@mara004
Member

mara004 commented Mar 14, 2023

Maybe the image is part of an AcroForm or something? Try calling pdf.init_forms() directly after constructing the PdfDocument. If this does not fix your problem, I'd need to see the PDF in question.

@mara004 mara004 added the needsinfo Additional information from the reporter is required label Mar 14, 2023
@mara004 mara004 self-assigned this Mar 14, 2023
@Tlaloc-Es
Author

Tlaloc-Es commented Mar 14, 2023

We tried the following code:

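import pypdfium2 as pdfium

# source: path to the input PDF, target: path of the output image
pdf = pdfium.PdfDocument(source)
# Initialize the form environment before any page handle is acquired
pdf.init_forms()
page = pdf.get_page(0)
pdf_bitmap = page.render(
    scale=3,
    rotation=0,
    crop=(0, 0, 0, 0),
    grayscale=False,
    optimize_mode=None,
    draw_annots=True,
)
pdf_bitmap.to_pil().save(target)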

However, it did not work. It's possible that the PDF is an AcroForm; how can I check this? Unfortunately, I cannot share the document as it's confidential. PDF2IMG works, but it's too slow. I am not sure if this will help identify why the signature is not appearing.

@mara004
Member

mara004 commented Mar 14, 2023

If the above does not work, then it's probably not about forms, but you could run
print( pdfium.internal.FormTypeToStr.get( pdf.get_formtype() ) ) to confirm.
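As a rough sketch of that check: `get_formtype()` returns one of PDFium's numeric `FORMTYPE_*` codes, with 0 meaning the document has no forms. The helper name `describe_forms`, the `FORM_TYPES` table and its labels are assumptions made for illustration; they are not part of pypdfium2, and the labels need not match what `FormTypeToStr` prints.

```python
# PDFium form type codes as defined in fpdf_formfill.h.
# The labels are illustrative and need not match pypdfium2's FormTypeToStr.
FORM_TYPES = {
    0: "no forms",          # FORMTYPE_NONE
    1: "AcroForm",          # FORMTYPE_ACRO_FORM
    2: "XFA (full)",        # FORMTYPE_XFA_FULL
    3: "XFA (foreground)",  # FORMTYPE_XFA_FOREGROUND
}


def describe_forms(pdf):
    """Return (has_forms, label) for any object exposing get_formtype()."""
    code = pdf.get_formtype()
    return code != 0, FORM_TYPES.get(code, f"unknown ({code})")


def main(path):
    # Imported here so describe_forms() stays usable without pypdfium2.
    import pypdfium2 as pdfium

    pdf = pdfium.PdfDocument(path)
    has_forms, label = describe_forms(pdf)
    print(f"{path}: {label}")
    if has_forms:
        print("Call pdf.init_forms() before getting any page handles.")
```

Calling `main("document.pdf")` (a placeholder path) prints the label for that file. If it reports no forms, the missing signature is probably not a forms problem.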

If you can't share a PDF that demonstrates the problem, then I'm afraid we can't help you.

Apart from that, what's PDF2Miner? Is this another PDF rendering engine? Could you share a link?

@Tlaloc-Es
Author

Hi, sorry, the library is PDF2IMG; that was a typo. The function you suggested returns None.

@mara004
Member

mara004 commented Mar 14, 2023

Hmm, could you (or whoever has such a document or is capable of creating one) provide or craft a non-confidential PDF that shows the problem?
Otherwise, I'll have to close this issue. Also, this sounds like a problem with PDFium itself, not with the Python bindings.

@Tlaloc-Es
Author

I apologize, but I was unable to create such a PDF file: the original was created in an office and I am not familiar with the process. However, I found that another library which also uses pdfium faced a similar issue. You may find this useful: pvginkel/PdfiumViewer#87.

@mara004
Member

mara004 commented Mar 14, 2023

Thanks, the linked test file (archive url) shows the problem. I'll investigate.

@mara004 mara004 added bug Something isn't working and removed needsinfo Additional information from the reporter is required labels Mar 14, 2023
mara004 added a commit that referenced this issue Mar 14, 2023
@mara004
Member

mara004 commented Mar 14, 2023

I found and fixed an issue with the multi-page renderer not initializing the formenv in worker processes. The signature of the file mentioned in #191 (comment) (which is part of an AcroForm) now displays correctly with the multi-page renderer (as used by the pypdfium2 render CLI).
But this does not look like your issue, because you're using the single-page renderer, and your document has no forms according to #191 (comment).
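To illustrate the kind of fix described above (a sketch only, not the actual pypdfium2 implementation): a worker that opens its own document has to initialize the form environment itself before it acquires a page handle. The name `render_job` and its parameters are hypothetical; the only library calls assumed are `init_forms()`, `get_page()` and `render()`, all of which appear earlier in this thread.

```python
def render_job(open_document, index, need_forms, **render_kwargs):
    """Render one page inside a worker (hypothetical helper, not pypdfium2 API).

    open_document: zero-argument callable returning a fresh document object,
    because each worker process has to open the document on its own.
    """
    pdf = open_document()
    if need_forms:
        # Has to happen in this worker, before the first page handle exists.
        pdf.init_forms()
    page = pdf.get_page(index)
    return page.render(**render_kwargs)
```

With pypdfium2, `open_document` could be something like `functools.partial(pdfium.PdfDocument, path)`. In a real process pool the rendered bitmap would also have to be converted to something picklable inside the worker before it is returned.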

mara004 added a commit that referenced this issue Mar 14, 2023
mara004 added a commit that referenced this issue Mar 14, 2023
This is an API-breaking change.
mara004 added a commit that referenced this issue Mar 14, 2023
This is an API-breaking change.
mara004 added a commit that referenced this issue Mar 14, 2023
This is an API-breaking change.
@mara004
Member

mara004 commented Mar 14, 2023

For me, pypdfium2 render SampleSignedPDFDocument.pdf -o out/ prints
Unsupported PDF feature: Signature annotation (triggered by pdfium.PdfUnspHandler().setup()).

So even if it seems to work for this particular document, it might be a feature not fully supported by PDFium.
Can you check if PDFium in Chrome/Chromium renders your signature PDF correctly? If not, you'd need to report it upstream.

@mara004 mara004 added the needsinfo Additional information from the reporter is required label Mar 14, 2023
mara004 added a commit that referenced this issue Mar 15, 2023
This reverts commit 4d25bb2.

I'm not too sure about this change. Generally having a callback just for
the form config seems slightly uncomfortable. I think we should keep the
plain config for init_forms() and add an optional callback to
PdfDocument.render() only.
@Tlaloc-Es
Author

Thank you, mara004. You work very quickly. Unfortunately, I was unable to check the information you provided due to the high security environment in which the PDFs are stored. Therefore, I suggest we close this issue for now. If I am able to access the information in the future, I will be sure to share it with you.

@mara004
Member

mara004 commented Mar 17, 2023

Thanks, feel free to reopen if you get access to the PDF again.
And thanks for making me aware of the forms issue with the document-level renderer, although it was somewhat unrelated.


2 participants