[formrecognizer] add logic to set page_number on ContactNames field (#14552)

* removing all the spots where we say US sales receipts now that locale is supported

* set page number on ContactNames

* move to helper function
kristapratico authored Oct 16, 2020
1 parent 8e7562c commit 6a3f9d4
Showing 14 changed files with 72 additions and 44 deletions.
6 changes: 3 additions & 3 deletions sdk/formrecognizer/azure-ai-formrecognizer/README.md
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@ from form documents. It includes the following main functionalities:

* Custom models - Recognize field values and table data from forms. These models are trained with your own data, so they're tailored to your forms.
* Content API - Recognize text, table structures, and selection marks, along with their bounding box coordinates, from documents. Corresponds to the REST service's Layout API.
* Prebuilt receipt model - Recognize data from USA sales receipts using a prebuilt model.
* Prebuilt receipt model - Recognize data from sales receipts using a prebuilt model.
* Prebuilt business card model - Recognize data from business cards using a prebuilt model.

[Source code][python-fr-src] | [Package (PyPI)][python-fr-pypi] | [API reference documentation][python-fr-ref-docs]| [Product documentation][python-fr-product-docs] | [Samples][python-fr-samples]
@@ -132,7 +132,7 @@ form_recognizer_client = FormRecognizerClient(
`FormRecognizerClient` provides operations for:

- Recognizing form fields and content using custom models trained to recognize your custom forms. These values are returned in a collection of `RecognizedForm` objects.
- Recognizing common fields from US receipts, using a pre-trained receipt model. These fields and metadata are returned in a collection of `RecognizedForm` objects.
- Recognizing common fields from sales receipts, using a pre-trained receipt model. These fields and metadata are returned in a collection of `RecognizedForm` objects.
- Recognizing common fields from business cards, using a pre-trained business card model. These fields and metadata are returned in a collection of `RecognizedForm` objects.
- Recognizing form content, including tables, lines, words, and selection marks, without the need to train a model. Form content is returned in a collection of `FormPage` objects.

@@ -250,7 +250,7 @@ for selection_mark in page[0].selection_marks:
```

### Recognize Receipts
Recognize data from USA sales receipts using a prebuilt model. Receipt fields recognized by the service can be found [here][service_recognize_receipt].
Recognize data from sales receipts using a prebuilt model. Receipt fields recognized by the service can be found [here][service_recognize_receipt].

```python
from azure.ai.formrecognizer import FormRecognizerClient
Original file line number Diff line number Diff line change
@@ -64,22 +64,21 @@ class FormRecognizerClient(FormRecognizerClientBase):
:caption: Creating the FormRecognizerClient with a token credential.
"""

def _prebuilt_callback(self, raw_response, _, headers): # pylint: disable=unused-argument
def _prebuilt_callback(self, raw_response, _, headers, **kwargs): # pylint: disable=unused-argument
analyze_result = self._deserialize(self._generated_models.AnalyzeOperationResult, raw_response)
return prepare_prebuilt_models(analyze_result)
return prepare_prebuilt_models(analyze_result, **kwargs)

@distributed_trace
def begin_recognize_receipts(self, receipt, **kwargs):
# type: (Union[bytes, IO[bytes]], Any) -> LROPoller[List[RecognizedForm]]
"""Extract field text and semantic values from a given US sales receipt.
"""Extract field text and semantic values from a given sales receipt.
The input document must be of one of the supported content types - 'application/pdf',
'image/jpeg', 'image/png' or 'image/tiff'.
See fields found on a receipt here:
https://aka.ms/formrecognizer/receiptfields
:param receipt: JPEG, PNG, PDF and TIFF type file stream or bytes.
Currently only supports US sales receipts.
:type receipt: bytes or IO[bytes]
:keyword bool include_field_elements:
Whether or not to include field elements such as lines and words in addition to form fields.
@@ -106,7 +105,7 @@ def begin_recognize_receipts(self, receipt, **kwargs):
:end-before: [END recognize_receipts]
:language: python
:dedent: 8
:caption: Recognize US sales receipt fields.
:caption: Recognize sales receipt fields.
"""
locale = kwargs.pop("locale", None)
content_type = kwargs.pop("content_type", None)
@@ -137,15 +136,14 @@ def begin_recognize_receipts(self, receipt, **kwargs):
@distributed_trace
def begin_recognize_receipts_from_url(self, receipt_url, **kwargs):
# type: (str, Any) -> LROPoller[List[RecognizedForm]]
"""Extract field text and semantic values from a given US sales receipt.
"""Extract field text and semantic values from a given sales receipt.
The input document must be the location (URL) of the receipt to be analyzed.
See fields found on a receipt here:
https://aka.ms/formrecognizer/receiptfields
:param str receipt_url: The URL of the receipt to analyze. The input must be a valid, encoded URL
of one of the supported formats: JPEG, PNG, PDF and TIFF. Currently only supports
US sales receipts.
of one of the supported formats: JPEG, PNG, PDF and TIFF.
:keyword bool include_field_elements:
Whether or not to include field elements such as lines and words in addition to form fields.
:keyword int polling_interval: Waiting time between two polls for LRO operations
@@ -167,7 +165,7 @@ def begin_recognize_receipts_from_url(self, receipt_url, **kwargs):
:end-before: [END recognize_receipts_from_url]
:language: python
:dedent: 8
:caption: Recognize US sales receipt fields from a URL.
:caption: Recognize sales receipt fields from a URL.
"""
locale = kwargs.pop("locale", None)
include_field_elements = kwargs.pop("include_field_elements", False)
@@ -234,7 +232,9 @@ def begin_recognize_business_cards(
file_stream=business_card,
content_type=content_type,
include_text_details=include_field_elements,
cls=kwargs.pop("cls", self._prebuilt_callback),
cls=kwargs.pop("cls", lambda pipeline_response, _, response_headers: self._prebuilt_callback(
pipeline_response, _, response_headers, business_card=True
)),
polling=True,
**kwargs
)
@@ -279,7 +279,9 @@ def begin_recognize_business_cards_from_url(
return self._client.begin_analyze_business_card_async( # type: ignore
file_stream={"source": business_card_url},
include_text_details=include_field_elements,
cls=kwargs.pop("cls", self._prebuilt_callback),
cls=kwargs.pop("cls", lambda pipeline_response, _, response_headers: self._prebuilt_callback(
pipeline_response, _, response_headers, business_card=True
)),
polling=True,
**kwargs
)
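
Note on the two business card hunks above: the default `cls` callback is now a lambda that adds `business_card=True` before delegating to `_prebuilt_callback`. The sketch below reproduces that forwarding pattern in isolation. It assumes the pipeline invokes `cls` with three positional arguments, as the callback's signature suggests; `prepare` and `prebuilt_callback` are hypothetical stand-ins, not the SDK's functions. It also shows `functools.partial` as an equivalent alternative, which is not what the commit uses.

```python
from functools import partial


def prepare(result, **kwargs):
    # Hypothetical stand-in for prepare_prebuilt_models: report which flags arrived.
    return {"result": result, "flags": sorted(kwargs)}


def prebuilt_callback(raw_response, _, headers, **kwargs):
    # Mirrors the new signature: extra keywords are forwarded downstream.
    return prepare(raw_response, **kwargs)


# Form used in the diff: a lambda that keeps the three-argument call shape
# and injects the business_card flag.
cls_lambda = lambda pipeline_response, _, response_headers: prebuilt_callback(
    pipeline_response, _, response_headers, business_card=True
)

# Equivalent alternative (an assumption for illustration, not the commit's code).
cls_partial = partial(prebuilt_callback, business_card=True)

from_lambda = cls_lambda("raw", None, {})
from_partial = cls_partial("raw", None, {})
```

Both callables hand the same flag to the downstream function, so the choice between them is stylistic. Receipt calls keep the plain callback, so no flag reaches the field conversion for receipts.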
Original file line number Diff line number Diff line change
@@ -83,6 +83,18 @@ def adjust_text_angle(text_angle):
return text_angle


def adjust_page_number(value):
    """Adjusts the page number on the business card field
    `ContactNames` to be set to the page number value found on `FirstName`
    """
    for val in value.value_array:
        if val.value_object.get("FirstName", None) and val.value_object.get("LastName", None):
            if val.value_object["FirstName"].page == val.value_object["LastName"].page:
                page_number = val.value_object["FirstName"].page
                val.page = page_number
    return value
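
Note on the helper above: a minimal sketch of its behavior, run against stand-in objects. `make_contact` and the `SimpleNamespace` shapes are hypothetical substitutes for the generated model classes; only the attribute names `value_array`, `value_object`, and `page` come from the diff.

```python
from types import SimpleNamespace


def adjust_page_number(value):
    """Copy the shared FirstName/LastName page onto each ContactNames item."""
    for val in value.value_array:
        first = val.value_object.get("FirstName")
        last = val.value_object.get("LastName")
        # Only set the page when both name parts exist and agree on the page.
        if first and last and first.page == last.page:
            val.page = first.page
    return value


def make_contact(first_page, last_page):
    # Hypothetical stand-in for one generated ContactNames array item.
    return SimpleNamespace(
        page=None,
        value_object={
            "FirstName": SimpleNamespace(page=first_page),
            "LastName": SimpleNamespace(page=last_page),
        },
    )


contact_names = SimpleNamespace(value_array=[make_contact(1, 1), make_contact(1, 2)])
adjusted = adjust_page_number(contact_names)
pages = [item.page for item in adjusted.value_array]
```

In this sketch the first item receives page 1, while the second keeps `page=None` because its name parts disagree. The helper mutates the items in place and returns the same object it was given.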


def get_authentication_policy(credential):
authentication_policy = None
if credential is None:
Original file line number Diff line number Diff line change
@@ -12,6 +12,7 @@
adjust_value_type,
adjust_text_angle,
adjust_confidence,
adjust_page_number,
get_element
)

@@ -36,7 +37,7 @@ def resolve_element(element, read_result):
raise ValueError("Failed to parse element reference.")


def get_field_value(field, value, read_result): # pylint: disable=too-many-return-statements
def get_field_value(field, value, read_result, **kwargs): # pylint: disable=too-many-return-statements
if value is None:
return value
if value.type == "string":
@@ -52,13 +53,16 @@ def get_field_value(field, value, read_result): # pylint: disable=too-many-retu
if value.type == "time":
return value.value_time
if value.type == "array":
# business cards pre-built model doesn't return a page number for the `ContactNames` field
if "business_card" in kwargs and field == "ContactNames":
value = adjust_page_number(value)
return [
FormField._from_generated(field, value, read_result)
FormField._from_generated(field, value, read_result, **kwargs)
for value in value.value_array
]
if value.type == "object":
return {
key: FormField._from_generated(key, value, read_result)
key: FormField._from_generated(key, value, read_result, **kwargs)
for key, value in value.value_object.items()
}
if value.type == "selectionMark":
@@ -251,12 +255,12 @@ def __init__(self, **kwargs):
self.confidence = kwargs.get("confidence", None)

@classmethod
def _from_generated(cls, field, value, read_result):
def _from_generated(cls, field, value, read_result, **kwargs):
return cls(
value_type=adjust_value_type(value.type) if value else None,
label_data=None, # not returned with receipt/supervised
value_data=FieldData._from_generated(value, read_result),
value=get_field_value(field, value, read_result),
value=get_field_value(field, value, read_result, **kwargs),
name=field,
confidence=adjust_confidence(value.confidence) if value else None,
)
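
Note on the hunks above: `get_field_value` and `FormField._from_generated` call each other recursively, so `**kwargs` has to be passed at every level for the `business_card` flag to reach the `ContactNames` array. The sketch below is a simplified, hypothetical analogue over plain lists and dicts (`convert` is not an SDK function). It only illustrates how the flag travels and where it is consulted.

```python
def convert(name, node, **kwargs):
    """Hypothetical, simplified analogue of get_field_value / _from_generated."""
    if isinstance(node, list):
        # The flag is only consulted for the one array field that needs the fix.
        fixed = "business_card" in kwargs and name == "ContactNames"
        return {
            "name": name,
            "fixed": fixed,
            # Keep forwarding kwargs so nested levels see the same flag.
            "items": [convert(name, item, **kwargs) for item in node],
        }
    if isinstance(node, dict):
        return {key: convert(key, child, **kwargs) for key, child in node.items()}
    return node


card = {"ContactNames": [{"FirstName": "Avery", "LastName": "Smith"}]}
with_flag = convert("root", card, business_card=True)
without_flag = convert("root", card)
```

Dropping `**kwargs` from any one of the recursive calls would silently lose the flag for everything nested below that call.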
Original file line number Diff line number Diff line change
@@ -19,7 +19,7 @@
)


def prepare_prebuilt_models(response):
def prepare_prebuilt_models(response, **kwargs):
prebuilt_models = []
read_result = response.analyze_result.read_results
document_result = response.analyze_result.document_results
@@ -33,7 +33,7 @@ def prepare_prebuilt_models(response):
pages=form_page[page.page_range[0]-1:page.page_range[1]],
form_type=page.doc_type,
fields={
key: FormField._from_generated(key, value, read_result)
key: FormField._from_generated(key, value, read_result, **kwargs)
for key, value in page.fields.items()
} if page.fields else None,
form_type_confidence=page.doc_type_confidence,
Original file line number Diff line number Diff line change
@@ -61,25 +61,24 @@ class FormRecognizerClient(FormRecognizerClientBaseAsync):
:caption: Creating the FormRecognizerClient with a token credential.
"""

def _prebuilt_callback(self, raw_response, _, headers): # pylint: disable=unused-argument
def _prebuilt_callback(self, raw_response, _, headers, **kwargs): # pylint: disable=unused-argument
analyze_result = self._deserialize(self._generated_models.AnalyzeOperationResult, raw_response)
return prepare_prebuilt_models(analyze_result)
return prepare_prebuilt_models(analyze_result, **kwargs)

@distributed_trace_async
async def begin_recognize_receipts(
self,
receipt: Union[bytes, IO[bytes]],
**kwargs: Any
) -> AsyncLROPoller[List[RecognizedForm]]:
"""Extract field text and semantic values from a given US sales receipt.
"""Extract field text and semantic values from a given sales receipt.
The input document must be of one of the supported content types - 'application/pdf',
'image/jpeg', 'image/png' or 'image/tiff'.
See fields found on a receipt here:
https://aka.ms/formrecognizer/receiptfields
:param receipt: JPEG, PNG, PDF and TIFF type file stream or bytes.
Currently only supports US sales receipts.
:type receipt: bytes or IO[bytes]
:keyword bool include_field_elements:
Whether or not to include field elements such as lines and words in addition to form fields.
@@ -106,7 +105,7 @@ async def begin_recognize_receipts(
:end-before: [END recognize_receipts_async]
:language: python
:dedent: 8
:caption: Recognize US sales receipt fields.
:caption: Recognize sales receipt fields.
"""
locale = kwargs.pop("locale", None)
content_type = kwargs.pop("content_type", None)
@@ -140,15 +139,14 @@ async def begin_recognize_receipts_from_url(
receipt_url: str,
**kwargs: Any
) -> AsyncLROPoller[List[RecognizedForm]]:
"""Extract field text and semantic values from a given US sales receipt.
"""Extract field text and semantic values from a given sales receipt.
The input document must be the location (URL) of the receipt to be analyzed.
See fields found on a receipt here:
https://aka.ms/formrecognizer/receiptfields
:param str receipt_url: The URL of the receipt to analyze. The input must be a valid, encoded URL
of one of the supported formats: JPEG, PNG, PDF and TIFF. Currently only supports
US sales receipts.
of one of the supported formats: JPEG, PNG, PDF and TIFF.
:keyword bool include_field_elements:
Whether or not to include field elements such as lines and words in addition to form fields.
:keyword int polling_interval: Waiting time between two polls for LRO operations
@@ -170,7 +168,7 @@ async def begin_recognize_receipts_from_url(
:end-before: [END recognize_receipts_from_url_async]
:language: python
:dedent: 8
:caption: Recognize US sales receipt fields from a URL.
:caption: Recognize sales receipt fields from a URL.
"""
locale = kwargs.pop("locale", None)

@@ -237,7 +235,9 @@ async def begin_recognize_business_cards(
file_stream=business_card,
content_type=content_type,
include_text_details=include_field_elements,
cls=kwargs.pop("cls", self._prebuilt_callback),
cls=kwargs.pop("cls", lambda pipeline_response, _, response_headers: self._prebuilt_callback(
pipeline_response, _, response_headers, business_card=True
)),
polling=True,
**kwargs
)
@@ -280,7 +280,9 @@ async def begin_recognize_business_cards_from_url(
return await self._client.begin_analyze_business_card_async( # type: ignore
file_stream={"source": business_card_url},
include_text_details=include_field_elements,
cls=kwargs.pop("cls", self._prebuilt_callback),
cls=kwargs.pop("cls", lambda pipeline_response, _, response_headers: self._prebuilt_callback(
pipeline_response, _, response_headers, business_card=True
)),
polling=True,
**kwargs
)
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
FILE: sample_recognize_receipts_async.py
DESCRIPTION:
This sample demonstrates how to recognize and extract common fields from US receipts,
This sample demonstrates how to recognize and extract common fields from receipts,
using a pre-trained receipt model. For a suggested approach to extracting information
from receipts, see sample_strongly_typed_recognized_form_async.py.
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
FILE: sample_recognize_receipts_from_url_async.py
DESCRIPTION:
This sample demonstrates how to recognize and extract common fields from a US receipt URL,
This sample demonstrates how to recognize and extract common fields from a receipt URL,
using a pre-trained receipt model. For a suggested approach to extracting information
from receipts, see sample_strongly_typed_recognized_form_async.py.
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
FILE: sample_recognize_receipts.py
DESCRIPTION:
This sample demonstrates how to recognize and extract common fields from US receipts,
This sample demonstrates how to recognize and extract common fields from receipts,
using a pre-trained receipt model. For a suggested approach to extracting information
from receipts, see sample_strongly_typed_recognized_form.py.
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
FILE: sample_recognize_receipts_from_url.py
DESCRIPTION:
This sample demonstrates how to recognize and extract common fields from a US receipt URL,
This sample demonstrates how to recognize and extract common fields from a receipt URL,
using a pre-trained receipt model. For a suggested approach to extracting information
from receipts, see sample_strongly_typed_recognized_form.py.
Original file line number Diff line number Diff line change
@@ -242,6 +242,7 @@ def test_business_card_jpg(self, client):
business_card = result[0]
# check dict values
self.assertEqual(len(business_card.fields.get("ContactNames").value), 1)
self.assertEqual(business_card.fields.get("ContactNames").value[0].value_data.page_number, 1)
self.assertEqual(business_card.fields.get("ContactNames").value[0].value['FirstName'].value, 'Avery')
self.assertEqual(business_card.fields.get("ContactNames").value[0].value['LastName'].value, 'Smith')

@@ -285,6 +286,7 @@ def test_business_card_png(self, client):
business_card = result[0]
# check dict values
self.assertEqual(len(business_card.fields.get("ContactNames").value), 1)
self.assertEqual(business_card.fields.get("ContactNames").value[0].value_data.page_number, 1)
self.assertEqual(business_card.fields.get("ContactNames").value[0].value['FirstName'].value, 'Avery')
self.assertEqual(business_card.fields.get("ContactNames").value[0].value['LastName'].value, 'Smith')

@@ -330,8 +332,8 @@ def test_business_card_jpg_include_field_elements(self, client):
self.assertFormPagesHasValues(business_card.pages)

for name, field in business_card.fields.items():
if field.value_type not in ["list", "dictionary"]:
self.assertFieldElementsHasValues(field.value_data.field_elements, receipt.page_range.first_page_number)
for f in field.value:
self.assertFieldElementsHasValues(f.value_data.field_elements, business_card.page_range.first_page_number)

@GlobalFormRecognizerAccountPreparer()
@GlobalClientPreparer()