
Commit

Version 1.16.1
Belval authored Sep 5, 2022
1 parent 9d1adee commit aa6aee8
Showing 12 changed files with 403 additions and 390 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -1,7 +1,7 @@
# pdf2image
[![CircleCI](https://circleci.com/gh/Belval/pdf2image/tree/master.svg?style=svg)](https://circleci.com/gh/Belval/pdf2image/tree/master) [![PyPI version](https://badge.fury.io/py/pdf2image.svg)](https://badge.fury.io/py/pdf2image) [![codecov](https://codecov.io/gh/Belval/pdf2image/branch/master/graph/badge.svg)](https://codecov.io/gh/Belval/pdf2image) [![Downloads](https://pepy.tech/badge/pdf2image/month)](https://pepy.tech/project/pdf2image) [![Documentation Status](https://readthedocs.org/projects/pdf2image/badge/?version=latest)](https://pdf2image.readthedocs.io/en/latest/?badge=latest)

-A python (3.6+) module that wraps pdftoppm and pdftocairo to convert PDF to a PIL Image object
+A python (3.7+) module that wraps pdftoppm and pdftocairo to convert PDF to a PIL Image object

## How to install

@@ -68,9 +68,9 @@ with tempfile.TemporaryDirectory() as path:

Here are the definitions:

-`convert_from_path(pdf_path, dpi=200, output_folder=None, first_page=None, last_page=None, fmt='ppm', jpegopt=None, thread_count=1, userpw=None, use_cropbox=False, strict=False, transparent=False, single_file=False, output_file=str(uuid.uuid4()), poppler_path=None, grayscale=False, size=None, paths_only=False, use_pdftocairo=False, timeout=600)`
+`convert_from_path(pdf_path, dpi=200, output_folder=None, first_page=None, last_page=None, fmt='ppm', jpegopt=None, thread_count=1, userpw=None, use_cropbox=False, strict=False, transparent=False, single_file=False, output_file=str(uuid.uuid4()), poppler_path=None, grayscale=False, size=None, paths_only=False, use_pdftocairo=False, timeout=600, hide_attributes=False)`

-`convert_from_bytes(pdf_file, dpi=200, output_folder=None, first_page=None, last_page=None, fmt='ppm', jpegopt=None, thread_count=1, userpw=None, use_cropbox=False, strict=False, transparent=False, single_file=False, output_file=str(uuid.uuid4()), poppler_path=None, grayscale=False, size=None, paths_only=False, use_pdftocairo=False, timeout=600)`
+`convert_from_bytes(pdf_file, dpi=200, output_folder=None, first_page=None, last_page=None, fmt='ppm', jpegopt=None, thread_count=1, userpw=None, use_cropbox=False, strict=False, transparent=False, single_file=False, output_file=str(uuid.uuid4()), poppler_path=None, grayscale=False, size=None, paths_only=False, use_pdftocairo=False, timeout=600, hide_attributes=False)`
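
As a rough illustration of the signatures above, the sketch below renders only the first page of a PDF. The helper name `first_page_as_image` and the file names in the usage note are hypothetical. It uses only the `dpi`, `first_page`, `last_page` and `fmt` parameters listed above, and it assumes pdf2image and poppler are installed when the function is actually called.

```python
from pathlib import Path


def first_page_as_image(pdf_path, dpi=200):
    """Return the first page of a PDF as a PIL image (hypothetical helper)."""
    path = Path(pdf_path)
    if not path.is_file():
        # Fail early with a clear error instead of a poppler error message.
        raise FileNotFoundError(f"No such PDF: {path}")

    # Imported lazily so this module still loads where pdf2image is missing.
    from pdf2image import convert_from_path

    # Restricting first_page and last_page to 1 renders a single page.
    pages = convert_from_path(
        str(path), dpi=dpi, first_page=1, last_page=1, fmt="png"
    )
    return pages[0]
```

Calling `first_page_as_image("example.pdf").save("page1.png")` would then write the rendered page to disk.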

## What's new?

19 changes: 12 additions & 7 deletions docs/conf.py
@@ -11,19 +11,20 @@
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
-# import os
-# import sys
-# sys.path.insert(0, os.path.abspath('.'))
+import os
+import sys
+
+sys.path.insert(0, os.path.abspath(".."))


# -- Project information -----------------------------------------------------

project = "pdf2image"
-copyright = "2019, Edouard Belval"
+copyright = "2022, Edouard Belval"
author = "Edouard Belval"

# The short X.Y version
-version = ""
+version = "1.16.1"
# The full version, including alpha/beta/rc tags
release = "latest"

@@ -40,6 +41,10 @@
extensions = [
    "sphinx.ext.mathjax",
    "sphinx.ext.viewcode",
+    "sphinx.ext.autodoc",
+    "sphinx.ext.coverage",
+    "recommonmark",
+    "sphinx_rtd_theme",
]

# Add any paths that contain templates here, relative to this directory.
@@ -59,7 +64,7 @@
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
-language = None
+language = "en"

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
@@ -75,7 +80,7 @@
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
-html_theme = "alabaster"
+html_theme = "sphinx_rtd_theme"

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
1 change: 1 addition & 0 deletions docs/index.rst
@@ -15,4 +15,5 @@ If you are new to the project, start with the installation section!

installation
overview
+known_issues
reference
23 changes: 4 additions & 19 deletions docs/installation.md
@@ -33,22 +33,7 @@ Poppler is the underlying project that does the magic in pdf2image. You can chec

### Windows

-1. Download the latest package from http://blog.alivate.com.au/poppler-windows/
-2. Extract the package
-3. Move the extracted directory to the desired place on your system
-4. Add the `bin/` directory to your [PATH](https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/)
-5. Test that all went well by opening `cmd` and making sure that you can call `pdftoppm -h`
-
-## Solution for DocuSign issue
-If you have this [error](https://stackoverflow.com/questions/66636441/pdf2image-library-failing-to-read-pdf-signed-using-docusign):
-```bash
-pdf2image.exceptions.PDFPageCountError: Unable to get page count.
-Syntax Error: Gen inside xref table too large (bigger than INT_MAX)
-Syntax Error: Invalid XRef entry 3
-Syntax Error: Top-level pages object is wrong type (null)
-Command Line Error: Wrong page range given: the first page (1) can not be after the last page (0).
-```
-
-You are possibly using an old version of poppler. The solution is to update to the latest version. Similarly, if you are working with Docker (Debian 11 Image), maybe you can not update poppler because is not available. So, you have to use an image in ubuntu, install Python and then what you need.
-
-More details [here](https://github.com/Belval/pdf2image/issues/234).
+1. Download the latest poppler package from [@oschwartz10612 version](https://github.com/oschwartz10612/poppler-windows/releases/) which is the most up-to-date.
+2. Move the extracted directory to the desired place on your system
+3. Add the `bin/` directory to your [PATH](https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/)
+4. Test that all went well by opening `cmd` and making sure that you can call `pdftoppm -h`
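
The last step of the Windows instructions can also be checked from Python. The following is a minimal sketch, not part of the commit: the helper name `find_poppler_tools` is hypothetical, and it relies only on the standard library's `shutil.which` to see whether the poppler binaries resolve on `PATH`.

```python
import shutil


def find_poppler_tools(tools=("pdftoppm", "pdftocairo", "pdfinfo")):
    """Map each poppler tool name to its resolved path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, location in find_poppler_tools().items():
        print(f"{tool}: {location or 'not found on PATH'}")
```

If a tool is reported as missing, passing the directory that contains the binaries through the `poppler_path` argument listed in the README signatures is the alternative to editing `PATH`.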
17 changes: 17 additions & 0 deletions docs/known_issues.md
@@ -0,0 +1,17 @@
# Limitations / Known Issues

## DocuSign PDFs

If you have this [error](https://stackoverflow.com/questions/66636441/pdf2image-library-failing-to-read-pdf-signed-using-docusign):

```bash
pdf2image.exceptions.PDFPageCountError: Unable to get page count.
Syntax Error: Gen inside xref table too large (bigger than INT_MAX)
Syntax Error: Invalid XRef entry 3
Syntax Error: Top-level pages object is wrong type (null)
Command Line Error: Wrong page range given: the first page (1) can not be after the last page (0).
```

You are probably using an old version of poppler; the fix is to update to the latest version. If you are working in Docker with a Debian 11 image, a newer poppler may not be available for that image. In that case, use an Ubuntu image instead, then install Python and whatever else you need on top of it.

More details [here](https://github.com/Belval/pdf2image/issues/234).
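
Since the advice above depends on which poppler version is installed, a quick way to inspect it may help. This is a sketch under stated assumptions: the helper name `poppler_version_banner` is hypothetical, and it assumes the poppler tools accept `-v` and print a version banner. That banner normally goes to stderr, so the code reads both streams.

```python
import shutil
import subprocess


def poppler_version_banner(tool="pdftoppm"):
    """Return the first line printed by `<tool> -v`, or None if the tool is absent."""
    executable = shutil.which(tool)
    if executable is None:
        return None
    # Capture both streams, because the banner is usually written to stderr.
    result = subprocess.run(
        [executable, "-v"], capture_output=True, text=True, timeout=30
    )
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else None
```

Running this inside the container in question shows whether the image ships an old poppler before you decide to switch base images.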
189 changes: 0 additions & 189 deletions docs/reference.md

This file was deleted.

20 changes: 20 additions & 0 deletions docs/reference.rst
@@ -0,0 +1,20 @@
Reference
**********

Main functions
--------------

.. automodule:: pdf2image.pdf2image
   :members:

Exceptions
----------

.. automodule:: pdf2image.exceptions
   :members:

Parsers
-------

.. automodule:: pdf2image.parsers
   :members:
10 changes: 5 additions & 5 deletions pdf2image/exceptions.py
@@ -4,30 +4,30 @@


class PopplerNotInstalledError(Exception):
-    """Happens when poppler is not installed"""
+    """Raised when poppler is not installed"""

    pass


class PDFInfoNotInstalledError(PopplerNotInstalledError):
-    """Happens when pdfinfo is not installed"""
+    """Raised when pdfinfo is not installed"""

    pass


class PDFPageCountError(Exception):
-    """Happens when the pdfinfo was unable to retrieve the page count"""
+    """Raised when the pdfinfo was unable to retrieve the page count"""

    pass


class PDFSyntaxError(Exception):
-    """Syntax error was thrown during rendering"""
+    """Raised when a syntax error was thrown during rendering"""

    pass


class PDFPopplerTimeoutError(Exception):
-    """Timeout when pdf convert image."""
+    """Raised when the timeout is exceeded while converting a PDF"""

    pass
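
The hierarchy above matters when handling errors: `PDFInfoNotInstalledError` subclasses `PopplerNotInstalledError`, so one `except` clause covers both. The sketch below illustrates this with local stand-in classes that mirror the diff, so that it runs without pdf2image installed. The helper `describe_failure` and its messages are hypothetical.

```python
# Local stand-ins mirroring the hierarchy shown in the diff above. In real
# code, import these names from pdf2image.exceptions instead.
class PopplerNotInstalledError(Exception):
    """Raised when poppler is not installed"""


class PDFInfoNotInstalledError(PopplerNotInstalledError):
    """Raised when pdfinfo is not installed"""


class PDFPageCountError(Exception):
    """Raised when the pdfinfo was unable to retrieve the page count"""


def describe_failure(error):
    """Turn a conversion error into a short hint (hypothetical helper)."""
    try:
        raise error
    except PopplerNotInstalledError:
        # Also catches PDFInfoNotInstalledError, which subclasses it.
        return "poppler missing: install it or pass poppler_path"
    except PDFPageCountError:
        return "page count failed: the PDF may be damaged or poppler too old"
```

Catching the parent class first keeps the handler short. Code that needs to tell the two installation errors apart would list `PDFInfoNotInstalledError` before its parent.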