Adds document text detection tutorial. #868
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
output-text.jpg |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
.. This file is automatically generated. Do not edit this file directly. | ||
|
||
Google Cloud Vision API Python Samples | ||
=============================================================================== | ||
|
||
This directory contains samples for Google Cloud Vision API. `Google Cloud Vision API`_ allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content. | ||
|
||
|
||
|
||
|
||
.. _Google Cloud Vision API: https://cloud.google.com/vision/docs | ||
|
||
Setup | ||
------------------------------------------------------------------------------- | ||
|
||
|
||
Authentication | ||
++++++++++++++ | ||
|
||
Authentication is typically done through `Application Default Credentials`_, | ||
which means you do not have to change the code to authenticate as long as | ||
your environment has credentials. You have a few options for setting up | ||
authentication: | ||
|
||
#. When running locally, use the `Google Cloud SDK`_ | ||
|
||
    .. code-block:: bash | ||
|
||
        gcloud beta auth application-default login | ||
|
||
|
||
#. When running on App Engine or Compute Engine, credentials are already | ||
   set up. However, you may need to configure your Compute Engine instance | ||
   with `additional scopes`_. | ||
|
||
#. You can create a `Service Account key file`_. This file can be used to | ||
   authenticate to Google Cloud Platform services from any environment. To use | ||
   the file, set the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable to | ||
   the path to the key file, for example: | ||
|
||
    .. code-block:: bash | ||
|
||
        export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json | ||
|
||
.. _Application Default Credentials: https://cloud.google.com/docs/authentication#getting_credentials_for_server-centric_flow | ||
.. _additional scopes: https://cloud.google.com/compute/docs/authentication#using | ||
.. _Service Account key file: https://developers.google.com/identity/protocols/OAuth2ServiceAccount#creatinganaccount | ||
|
||
Install Dependencies | ||
++++++++++++++++++++ | ||
|
||
#. Install `pip`_ and `virtualenv`_ if you do not already have them. | ||
|
||
#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+. | ||
|
||
    .. code-block:: bash | ||
|
||
        $ virtualenv env | ||
        $ source env/bin/activate | ||
|
||
#. Install the dependencies needed to run the samples. | ||
|
||
    .. code-block:: bash | ||
|
||
        $ pip install -r requirements.txt | ||
|
||
.. _pip: https://pip.pypa.io/ | ||
.. _virtualenv: https://virtualenv.pypa.io/ | ||
|
||
Samples | ||
------------------------------------------------------------------------------- | ||
|
||
Document Text tutorial | ||
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ | ||
|
||
|
||
|
||
To run this sample: | ||
|
||
.. code-block:: bash | ||
|
||
    $ python doctext.py | ||
|
||
    usage: doctext.py [-h] [-out_file OUT_FILE] detect_file | ||
|
||
    positional arguments: | ||
      detect_file         The image for text detection. | ||
|
||
    optional arguments: | ||
      -h, --help          show this help message and exit | ||
      -out_file OUT_FILE  Optional output file | ||
|
||
|
||
|
||
|
||
The client library | ||
------------------------------------------------------------------------------- | ||
|
||
This sample uses the `Google Cloud Client Library for Python`_. | ||
You can read the documentation for more details on API usage and use GitHub | ||
to `browse the source`_ and `report issues`_. | ||
|
||
.. _Google Cloud Client Library for Python: | ||
    https://googlecloudplatform.github.io/google-cloud-python/ | ||
.. _browse the source: | ||
    https://github.com/GoogleCloudPlatform/google-cloud-python | ||
.. _report issues: | ||
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues | ||
|
||
|
||
.. _Google Cloud SDK: https://cloud.google.com/sdk/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# This file is used to generate README.rst | ||
|
||
product: | ||
  name: Google Cloud Vision API | ||
  short_name: Cloud Vision API | ||
  url: https://cloud.google.com/vision/docs | ||
  description: > | ||
    `Google Cloud Vision API`_ allows developers to easily integrate vision | ||
    detection features within applications, including image labeling, face and | ||
    landmark detection, optical character recognition (OCR), and tagging of | ||
    explicit content. | ||
|
||
setup: | ||
- auth | ||
- install_deps | ||
|
||
samples: | ||
- name: Document Text tutorial | ||
  file: doctext.py | ||
  show_help: True | ||
|
||
cloud_client_library: true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
#!/usr/bin/env python | ||
|
||
# Copyright 2017 Google Inc. All Rights Reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
"""Outlines document text given an image. | ||
|
||
Example: | ||
    python doctext.py resources/text_menu.jpg | ||
""" | ||
# [START full_tutorial] | ||
# [START imports] | ||
import argparse | ||
from enum import Enum | ||
import io | ||
|
||
from google.cloud import vision | ||
from PIL import Image, ImageDraw | ||
# [END imports] | ||
|
||
|
||
class FeatureType(Enum): | ||
    PAGE = 1 | ||
    BLOCK = 2 | ||
    PARA = 3 | ||
    WORD = 4 | ||
    SYMBOL = 5 | ||
|
||
|
||
def draw_boxes(image, blocks, color): | ||
    """Draw a polygon outline of the given color around each bounding box.""" | ||
    # [START draw_blocks] | ||
    draw = ImageDraw.Draw(image) | ||
|
||
    for block in blocks: | ||
        draw.polygon([block.vertices[0].x, block.vertices[0].y, | ||
Review comment: ew hanging indents.
Reply: Done. |
||
                      block.vertices[1].x, block.vertices[1].y, | ||
                      block.vertices[2].x, block.vertices[2].y, | ||
                      block.vertices[3].x, block.vertices[3].y], None, color) | ||
    return image | ||
    # [END draw_blocks] | ||
|
||
|
||
def get_document_bounds(image_file, feature): | ||
    # [START detect_bounds] | ||
    """Returns document bounds given an image.""" | ||
    vision_client = vision.Client() | ||
|
||
    bounds = [] | ||
|
||
    with io.open(image_file, 'rb') as image_file: | ||
        content = image_file.read() | ||
|
||
    image = vision_client.image(content=content) | ||
    document = image.detect_full_text() | ||
|
||
    # Append specified feature bounds by enumerating all document features | ||
    for page in document.pages: | ||
|
||
Review comment: No need for these blank newlines in nested fors. |
||
        for block in page.blocks: | ||
|
||
            for paragraph in block.paragraphs: | ||
|
||
                for word in paragraph.words: | ||
|
||
                    for symbol in word.symbols: | ||
|
||
                        if (feature == FeatureType.SYMBOL): | ||
                            bounds.append(symbol.bounding_box) | ||
|
||
                    if (feature == FeatureType.WORD): | ||
                        bounds.append(word.bounding_box) | ||
|
||
                if (feature == FeatureType.PARA): | ||
                    bounds.append(paragraph.bounding_box) | ||
|
||
            if (feature == FeatureType.BLOCK): | ||
                bounds.append(block.bounding_box) | ||
|
||
        if (feature == FeatureType.PAGE): | ||
            bounds.append(block.bounding_box) | ||
|
||
    return bounds | ||
    # [END detect_bounds] | ||
|
||
|
||
def render_doc_text(filein, fileout): | ||
    # [START render_doc_text] | ||
    image = Image.open(filein) | ||
    bounds = get_document_bounds(filein, FeatureType.PAGE) | ||
    draw_boxes(image, bounds, 'blue') | ||
    bounds = get_document_bounds(filein, FeatureType.PARA) | ||
    draw_boxes(image, bounds, 'red') | ||
    bounds = get_document_bounds(filein, FeatureType.WORD) | ||
    draw_boxes(image, bounds, 'yellow') | ||
|
||
    # Compare by value: identity checks against an int literal are unreliable. | ||
    if fileout != 0: | ||
        image.save(fileout) | ||
    else: | ||
        image.show() | ||
    # [END render_doc_text] | ||
|
||
|
||
if __name__ == '__main__': | ||
    # [START run_crop] | ||
    parser = argparse.ArgumentParser() | ||
    parser.add_argument('detect_file', help='The image for text detection.') | ||
    parser.add_argument('-out_file', help='Optional output file', default=0) | ||
    args = parser.parse_args() | ||
|
||
    render_doc_text(args.detect_file, args.out_file) | ||
    # [END run_crop] | ||
# [END full_tutorial] |
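
The traversal in `get_document_bounds` can be tried offline, without credentials. The following is a minimal sketch rather than the sample's own code: it assumes the response nests pages, blocks, paragraphs, words, and symbols exactly as the loops in `doctext.py` do, and it uses `SimpleNamespace` stand-ins (Python 3) in place of the API objects. The names `collect_bounds` and `make_fake_document` are hypothetical, and the `PAGE` case is left out.

```python
from enum import Enum
from types import SimpleNamespace


class FeatureType(Enum):
    BLOCK = 2
    PARA = 3
    WORD = 4
    SYMBOL = 5


def collect_bounds(document, feature):
    """Return the bounding boxes of every element of the requested type."""
    bounds = []
    for page in document.pages:
        for block in page.blocks:
            if feature == FeatureType.BLOCK:
                bounds.append(block.bounding_box)
            for paragraph in block.paragraphs:
                if feature == FeatureType.PARA:
                    bounds.append(paragraph.bounding_box)
                for word in paragraph.words:
                    if feature == FeatureType.WORD:
                        bounds.append(word.bounding_box)
                    for symbol in word.symbols:
                        if feature == FeatureType.SYMBOL:
                            bounds.append(symbol.bounding_box)
    return bounds


def make_fake_document():
    """Build a stand-in document: 1 page, 1 block, 2 paragraphs, 3 words."""
    def word(name, n_symbols):
        # Strings stand in for bounding boxes so the results are easy to read.
        symbols = [SimpleNamespace(bounding_box='%s-s%d' % (name, i))
                   for i in range(n_symbols)]
        return SimpleNamespace(bounding_box=name, symbols=symbols)

    para1 = SimpleNamespace(bounding_box='p1',
                            words=[word('w1', 2), word('w2', 1)])
    para2 = SimpleNamespace(bounding_box='p2', words=[word('w3', 3)])
    block = SimpleNamespace(bounding_box='b1', paragraphs=[para1, para2])
    page = SimpleNamespace(blocks=[block])
    return SimpleNamespace(pages=[page])
```

Each check here runs when its element is entered rather than after its children, which yields the same per-feature ordering as the sample and needs no blank lines between the nested `for` statements.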
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
# Copyright 2017 Google Inc. All Rights Reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
import os | ||
|
||
import doctext | ||
|
||
|
||
def test_text(cloud_config, capsys): | ||
    """Checks that the output image with the text outlines is created.""" | ||
    doctext.render_doc_text('resources/text_menu.jpg', 'output-text.jpg') | ||
    out, _ = capsys.readouterr() | ||
    assert os.path.isfile('output-text.jpg') | ||
Review comment: add |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
google-cloud-vision==0.23.2 | ||
pillow==4.0.0 |
Review comment: I meant to ask this in the other sample, but why not use http://pillow.readthedocs.io/en/3.1.x/reference/ImageDraw.html#PIL.ImageDraw.PIL.ImageDraw.Draw.polygon?
Reply: Oh nice! For the boundaries here the objects might just work too because they're x,y tuples.
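
To illustrate the point in this thread: Pillow's `ImageDraw.Draw.polygon` accepts either a flat `[x0, y0, x1, y1, ...]` list or a sequence of `(x, y)` tuples. Below is a minimal sketch, assuming each vertex exposes `.x` and `.y` as in `draw_boxes`. The `Vertex` tuple and the `outline_box` helper are hypothetical stand-ins, not part of the sample or of the Vision client.

```python
from collections import namedtuple

from PIL import Image, ImageDraw

# Hypothetical stand-in for the API's vertex objects (anything with .x and .y).
Vertex = namedtuple('Vertex', ['x', 'y'])


def outline_box(image, vertices, color):
    """Outline one bounding box, passing the vertices as (x, y) tuples."""
    draw = ImageDraw.Draw(image)
    draw.polygon([(v.x, v.y) for v in vertices], fill=None, outline=color)
    return image


# Draw a red rectangle outline on a blank white canvas.
image = Image.new('RGB', (100, 100), 'white')
box = [Vertex(10, 10), Vertex(90, 10), Vertex(90, 60), Vertex(10, 60)]
outline_box(image, box, 'red')
```

Building the tuple list with a comprehension avoids indexing `vertices[0]` through `vertices[3]` by hand and works for boxes with any number of vertices.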