Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge Feature/formrecognizer2.1 into master #17121

Merged
merged 10 commits into from
Nov 20, 2020
10 changes: 8 additions & 2 deletions sdk/formrecognizer/Azure.AI.FormRecognizer/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,19 +3,25 @@
## 3.1.0-beta.1 (Unreleased)

### Breaking changes
- It defaults to the latest supported API version, which currently is `2.1-preview.1`.
- It defaults to the latest supported API version, which currently is `2.1-preview.2`.

### New Features
- Added support for pre-built business card recognition.
- Added support for pre-built invoices recognition.
- Added support for providing locale info when recognizing receipts and business cards. Supported locales include support EN-US, EN-AU, EN-CA, EN-GB, EN-IN.
- Added support for providing the document language in `StartRecognizeContent` when recognizing a form.
- Added support to train and recognize custom forms with selection marks such as check boxes and radio buttons. This functionality is only available in train with labels scenarios.
- Added support to `StartRecognizeContent` to recognize selection marks such as check boxes and radio buttons.
- Added ability to create a composed model from the `FormTrainingClient` by calling method `StartCreateComposedModel`.
- Added ability to pass parameter `ModelName` to `StartTraining` methods.
- Added the properties `ModelName` and `Properties` to types `CustomFormModel` and `CustomFormModelInfo`.
- Added type `CustomFormModelProperties` that includes information like if a model is a composed model.
- Added property `ModelId` to `CustomFormSubmodel` and `TrainingDocumentInfo`.
- Added properties `ModelId` and `FormTypeConfidence` to `RecognizedForm`.
- Added support to `StartRecognizeContent` to recognize selection marks such as check boxes and radio buttons.
- Added property `Appearance` to `FormLine` to indicate the style of the extracted text. for example, "handwriting" or "other".
- Added property `BoundingBox` to `FormTable`.
- Added support for `ContentType` `image/bmp` in recognize content and prebuilt models.
- Added property `Pages` to `RecognizeContentOptions` to specify the page numbers to analyze.

## 3.0.0 (2020-08-20)

Expand Down
66 changes: 59 additions & 7 deletions sdk/formrecognizer/Azure.AI.FormRecognizer/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,10 @@ Azure Cognitive Services Form Recognizer is a cloud service that uses machine le

- Recognize Custom Forms - Recognize and extract form fields and other content from your custom forms, using models you trained with your own form types.
- Recognize Form Content - Recognize and extract tables, lines, words, and selection marks like radio buttons and check boxes in forms documents, without the need to train a model.
- Recognize Receipts - Recognize and extract common fields from receipts, using a pre-trained receipt model.
- Recognize Business Card - Recognize and extract common fields from business cards, using a pre-trained business cards model.
- Recognize Prebuilt models - Recognize data using the following prebuilt models:
- Receipts - Recognize and extract common fields from receipts, using a pre-trained receipt model.
- Business Cards - Recognize and extract common fields from business cards, using a pre-trained business cards model.
- Invoices - Recognize and extract common fields from invoices, using a pre-trained invoice model.

[Source code][formreco_client_src] | [Package (NuGet)][formreco_nuget_package] | [API reference documentation][formreco_refdocs] | [Product documentation][formreco_docs] | [Samples][formreco_samples]

Expand Down Expand Up @@ -101,10 +103,12 @@ var client = new FormRecognizerClient(new Uri(endpoint), new DefaultAzureCredent

`FormRecognizerClient` provides operations for:

- Recognizing form fields and content, using custom models trained to recognize your custom forms. These values are returned in a collection of `RecognizedForm` objects. See example [Recognize Custom Forms](#recognize-custom-forms).
- Recognizing form content, including tables, lines, words, and selection marks like radio buttons and check boxes without the need to train a model. Form content is returned in a collection of `FormPage` objects. See example [Recognize Content](#recognize-content).
- Recognizing common fields from receipts, using a pre-trained receipt model on the Form Recognizer service. These fields and meta-data are returned in a collection of `RecognizedForm` objects. See example [Recognize Receipts](#recognize-receipts).
- Recognizing common fields from business cards, using a pre-trained business cards model on the Form Recognizer service. These fields and meta-data are returned in a collection of `RecognizedForm` objects. See example [Recognize Business Cards](#recognize-business-cards).
- Recognizing form fields and content, using custom models trained to recognize your custom forms. These values are returned in a collection of `RecognizedForm` objects. See example [Recognize Custom Forms](#recognize-custom-forms).
- Recognizing form content, including tables, lines, words, and selection marks like radio buttons and check boxes without the need to train a model. Form content is returned in a collection of `FormPage` objects. See example [Recognize Content](#recognize-content).
- Recognizing common fields from the following form types using prebuilt models. These fields and meta-data are returned in a collection of `RecognizedForm` objects.
- Sales receipts. See example [Recognize Receipts](#recognize-receipts).
- Business cards. See example [Recognize Business Cards](#recognize-business-cards).
- Invoices. See example [Recognize Invoices](#recognize-invoices).

### FormTrainingClient

Expand Down Expand Up @@ -134,6 +138,7 @@ The following section provides several code snippets illustrating common pattern
* [Recognize Custom Forms](#recognize-custom-forms)
* [Recognize Receipts](#recognize-receipts)
* [Recognize Business Cards](#recognize-business-cards)
* [Recognize Invoices](#recognize-invoices)
* [Train a Model](#train-a-model)
* [Manage Custom Models](#manage-custom-models)

Expand Down Expand Up @@ -299,7 +304,7 @@ using (FileStream stream = new FileStream(receiptPath, FileMode.Open))
Recognize data from business cards using a prebuilt model. Business card fields recognized by the service can be found [here][service_recognize_business_cards_fields].

```C# Snippet:FormRecognizerSampleRecognizeBusinessCardFileStream
using (FileStream stream = new FileStream(busienssCardsPath, FileMode.Open))
using (FileStream stream = new FileStream(businessCardsPath, FileMode.Open))
{
var options = new RecognizeBusinessCardsOptions() { Locale = "en-US" };
RecognizedFormCollection businessCards = await client.StartRecognizeBusinessCardsAsync(stream, options).WaitForCompletionAsync();
Expand Down Expand Up @@ -368,6 +373,52 @@ using (FileStream stream = new FileStream(busienssCardsPath, FileMode.Open))
}
```

### Recognize Invoices
Recognize data from invoices using a prebuilt model. Invoices fields recognized by the service can be found [here][service_recognize_invoices_fields].

```C# Snippet:FormRecognizerSampleRecognizeInvoicesFileStream
using (FileStream stream = new FileStream(invoicePath, FileMode.Open))
{
var options = new RecognizeInvoicesOptions() { Locale = "en-US" };
RecognizedFormCollection invoices = await client.StartRecognizeInvoicesAsync(stream, options).WaitForCompletionAsync();

// To see the list of the supported fields returned by service and its corresponding types, consult:
// https://aka.ms/formrecognizer/invoicefields

RecognizedForm invoice = invoices.Single();

FormField vendorNameField;
if (invoice.Fields.TryGetValue("VendorName", out vendorNameField))
{
if (vendorNameField.Value.ValueType == FieldValueType.String)
{
string vendorName = vendorNameField.Value.AsString();
Console.WriteLine($" Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
}
}

FormField customerNameField;
if (invoice.Fields.TryGetValue("CustomerName", out customerNameField))
{
if (customerNameField.Value.ValueType == FieldValueType.String)
{
string customerName = customerNameField.Value.AsString();
Console.WriteLine($" Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
}
}

FormField invoiceTotalField;
if (invoice.Fields.TryGetValue("InvoiceTotal", out invoiceTotalField))
{
if (invoiceTotalField.Value.ValueType == FieldValueType.Float)
{
float invoiceTotal = invoiceTotalField.Value.AsFloat();
Console.WriteLine($" Invoice Total: '{invoiceTotal}', with confidence {invoiceTotalField.Confidence}");
}
}
}
```

### Train a Model
Train a machine-learned model on your own form types. The resulting model will be able to recognize values from the types of forms it was trained on.

Expand Down Expand Up @@ -602,6 +653,7 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con
[labeling_tool]: https://docs.microsoft.com/azure/cognitive-services/form-recognizer/quickstarts/label-tool
[service_recognize_receipt_fields]: https://aka.ms/formrecognizer/receiptfields
[service_recognize_business_cards_fields]: https://aka.ms/formrecognizer/businesscardfields
[service_recognize_invoices_fields]: https://aka.ms/formrecognizer/invoicefields
[dotnet_lro_guidelines]: https://azure.github.io/azure-sdk/dotnet_introduction.html#dotnet-longrunning

[logging]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/core/Azure.Core/samples/Diagnostics.md
Expand Down
Loading