Unable to extract all images from PDF #6

Davidwhw · 2024-05-29T10:10:09Z

Sorry to raise the same issue as pdf_parser/issues/1#issue-2307687422 again; I have not received a reply there and urgently need a solution.

When I use the pdffigures2 backend to extract images from a PDF, some images are often missed. For example, pdf_parser extracts only 3 images from a PDF file that contains 5. (In my observation, pdffigures2 is still the best of the three image extraction backends; cermine, for example, cuts a complete image into pieces.)
Could the pdffigures2 backend be using default parameters, such as "image size" or "resolution", to filter the images?
Can you give me some advice or clues?
Thank you for your assistance.
