[translation] simplify input to translation #19060

Merged
merged 15 commits into from
Jun 3, 2021
3 changes: 3 additions & 0 deletions sdk/translation/azure-ai-translation-document/CHANGELOG.md
@@ -21,6 +21,9 @@ translation has completed.
- Authentication using `azure-identity` credentials now supported.
- see the [Azure Identity documentation](https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/identity/azure-identity/README.md) for more information.
- Added paging and filtering options to `list_all_document_statuses` and `list_submitted_jobs`.
- The input to `begin_translation` now accepts either the parameter `inputs` as a `List[DocumentTranslationInput]` to
perform multiple translations, or the parameters `source_url`, `target_url`, and `target_language_code` to perform a
single translation of your documents.

**Dependency updates**

84 changes: 62 additions & 22 deletions sdk/translation/azure-ai-translation-document/README.md
@@ -141,26 +141,21 @@ To begin translating your documents, pass a list of `DocumentTranslationInput` i
Constructing a `DocumentTranslationInput` requires that you pass the SAS URLs to your source and target containers (or files)
and the target language(s) for translation.

A single source container with documents can be translated to many different languages:
A single source container with documents can be translated to a different language:

```python
from azure.ai.translation.document import DocumentTranslationInput, TranslationTarget
from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import DocumentTranslationClient

my_input = [
DocumentTranslationInput(
source_url="<sas_url_to_source>",
targets=[
TranslationTarget(target_url="<sas_url_to_target_fr>", language_code="fr"),
TranslationTarget(target_url="<sas_url_to_target_de>", language_code="de")
]
)
]
document_translation_client = DocumentTranslationClient("<endpoint>", AzureKeyCredential("<api_key>"))
poller = document_translation_client.begin_translation("<sas_url_to_source>", "<sas_url_to_target>", "fr")
Member

nit: do you think it is worth creating a variable language_code="fr" and then using that variable, so the user understands what "fr" is? Or maybe just follow the same pattern you already have and use <language_code>.

Member

I saw we do this in .NET too. If you agree, I can create an issue for .NET. If you think it is not important, I will leave it as is :)

Member Author

I was trying to demonstrate that these can be passed positionally. I think I'll update to "<target_language_code>" here to be consistent. Thanks for pointing that out!

```

Or multiple different sources can be provided, each with their own targets.

```python
from azure.ai.translation.document import DocumentTranslationInput, TranslationTarget
from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import DocumentTranslationClient, DocumentTranslationInput, TranslationTarget

my_input = [
DocumentTranslationInput(
@@ -185,6 +180,9 @@ my_input = [
]
)
]

document_translation_client = DocumentTranslationClient("<endpoint>", AzureKeyCredential("<api_key>"))
poller = document_translation_client.begin_translation(my_input)
```

> Note: the target_url for each target language must be unique.
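
The uniqueness requirement in the note can be checked on the client before a job is submitted. The following is a minimal sketch and not part of the SDK: `find_duplicate_target_urls` is a hypothetical helper, and plain `(target_url, language_code)` tuples stand in for `TranslationTarget` objects so the snippet runs without the package installed. How the service itself reports a duplicate is not covered here.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def find_duplicate_target_urls(targets: Iterable[Tuple[str, str]]) -> List[str]:
    """Return every target URL that appears more than once.

    Each item is a (target_url, language_code) pair. The tuples are a
    stand-in for TranslationTarget (an assumption made for this sketch).
    """
    counts = Counter(url for url, _language in targets)
    return sorted(url for url, seen in counts.items() if seen > 1)


targets = [
    ("<sas_url_to_target_fr>", "fr"),
    ("<sas_url_to_target_de>", "de"),
    ("<sas_url_to_target_fr>", "es"),  # reuses the French container by mistake
]

duplicates = find_duplicate_target_urls(targets)
if duplicates:
    print("Duplicate target URLs:", duplicates)
```

Running a check like this before calling `begin_translation` surfaces a copy-paste mistake in the placeholders early, rather than after the request has been sent.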
@@ -206,20 +204,61 @@ Sample code snippets are provided to illustrate using long-running operations [b
The following section provides several code snippets covering some of the most common Document Translation tasks, including:

* [Translate your documents](#translate-your-documents "Translate Your Documents")
* [Translate multiple inputs](#translate-multiple-inputs "Translate Multiple Inputs")
* [List translation operations](#list-translation-operations "List Translation Operations")

### Translate your documents
Translate the documents in your source container to the target containers.
Translate the documents in your source container to the target container.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import DocumentTranslationClient

endpoint = "https://<resource-name>.cognitiveservices.azure.com/"
credential = AzureKeyCredential("<api_key>")
source_container_sas_url_en = "<sas-url-en>"
target_container_sas_url_es = "<sas-url-es>"

document_translation_client = DocumentTranslationClient(endpoint, credential)

poller = document_translation_client.begin_translation(source_container_sas_url_en, target_container_sas_url_es, "es")

result = poller.result()

print("Status: {}".format(poller.status()))
print("Created on: {}".format(poller.details.created_on))
print("Last updated on: {}".format(poller.details.last_updated_on))
print("Total number of translations on documents: {}".format(poller.details.documents_total_count))

print("\nOf total documents...")
print("{} failed".format(poller.details.documents_failed_count))
print("{} succeeded".format(poller.details.documents_succeeded_count))

for document in result:
    print("Document ID: {}".format(document.id))
    print("Document status: {}".format(document.status))
    if document.status == "Succeeded":
        print("Source document location: {}".format(document.source_document_url))
        print("Translated document location: {}".format(document.translated_document_url))
        print("Translated to language: {}\n".format(document.translate_to))
    else:
        print("Error Code: {}, Message: {}\n".format(document.error.code, document.error.message))
```
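
When only a tally is needed, the per-document loop above can be condensed. This is a minimal sketch, assuming each result item exposes a `status` attribute as in the sample. `summarize_statuses` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the documents returned by `poller.result()`, not real service output.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_statuses(documents):
    """Count documents per status string (for example "Succeeded" or "Failed")."""
    return Counter(document.status for document in documents)


# Stand-in results so the sketch runs without calling the service.
fake_result = [
    SimpleNamespace(id="1", status="Succeeded"),
    SimpleNamespace(id="2", status="Failed"),
    SimpleNamespace(id="3", status="Succeeded"),
]

summary = summarize_statuses(fake_result)
for status, count in sorted(summary.items()):
    print("{}: {}".format(status, count))
```

With a real poller, the same function could be given the iterable returned by `poller.result()` in place of `fake_result`.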

### Translate multiple inputs
Begin translating documents from multiple source containers to multiple target containers in different languages.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.translation.document import DocumentTranslationClient, DocumentTranslationInput, TranslationTarget

endpoint = "https://<resource-name>.cognitiveservices.azure.com/"
credential = AzureKeyCredential("<api_key>")
source_container_sas_url_de = "<sas-url-de>"
source_container_sas_url_en = "<sas-url-en>"
target_container_sas_url_es = "<sas-url-es>"
target_container_sas_url_fr = "<sas-url-fr>"
target_container_sas_url_ar = "<sas-url-fr>"
Member

Suggested change
- target_container_sas_url_ar = "<sas-url-fr>"
+ target_container_sas_url_ar = "<sas-url-ar>"


document_translation_client = DocumentTranslationClient(endpoint, credential)

@@ -231,21 +270,18 @@ poller = document_translation_client.begin_translation(
TranslationTarget(target_url=target_container_sas_url_es, language_code="es"),
TranslationTarget(target_url=target_container_sas_url_fr, language_code="fr"),
],
),
DocumentTranslationInput(
source_url=source_container_sas_url_de,
targets=[
TranslationTarget(target_url=target_container_sas_url_ar, language_code="ar"),
],
)
]
)

result = poller.result()

print("Status: {}".format(poller.status()))
print("Created on: {}".format(poller.details.created_on))
print("Last updated on: {}".format(poller.details.last_updated_on))
print("Total number of translations on documents: {}".format(poller.details.documents_total_count))

print("\nOf total documents...")
print("{} failed".format(poller.details.documents_failed_count))
print("{} succeeded".format(poller.details.documents_succeeded_count))

for document in result:
print("Document ID: {}".format(document.id))
print("Document status: {}".format(document.status))
@@ -321,6 +357,7 @@ These code samples show common scenario operations with the Azure Document Trans

* Client authentication: [sample_authentication.py][sample_authentication]
* Begin translating documents: [sample_begin_translation.py][sample_begin_translation]
* Translate with multiple inputs: [sample_translate_multiple_inputs.py][sample_translate_multiple_inputs]
* Check the status of documents: [sample_check_document_statuses.py][sample_check_document_statuses]
* List all submitted translation jobs: [sample_list_all_submitted_jobs.py][sample_list_all_submitted_jobs]
* Apply a custom glossary to translation: [sample_translation_with_glossaries.py][sample_translation_with_glossaries]
@@ -334,6 +371,7 @@ are found under the `azure.ai.translation.document.aio` namespace.

* Client authentication: [sample_authentication_async.py][sample_authentication_async]
* Begin translating documents: [sample_begin_translation_async.py][sample_begin_translation_async]
* Translate with multiple inputs: [sample_translate_multiple_inputs_async.py][sample_translate_multiple_inputs_async]
* Check the status of documents: [sample_check_document_statuses_async.py][sample_check_document_statuses_async]
* List all submitted translation jobs: [sample_list_all_submitted_jobs_async.py][sample_list_all_submitted_jobs_async]
* Apply a custom glossary to translation: [sample_translation_with_glossaries_async.py][sample_translation_with_glossaries_async]
@@ -390,6 +428,8 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con
[sample_authentication_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/translation/azure-ai-translation-document/samples/async_samples/sample_authentication_async.py
[sample_begin_translation]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/translation/azure-ai-translation-document/samples/sample_begin_translation.py
[sample_begin_translation_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/translation/azure-ai-translation-document/samples/async_samples/sample_begin_translation_async.py
[sample_translate_multiple_inputs]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/translation/azure-ai-translation-document/samples/sample_translate_multiple_inputs.py
[sample_translate_multiple_inputs_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/translation/azure-ai-translation-document/samples/async_samples/sample_translate_multiple_inputs_async.py
[sample_check_document_statuses]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/translation/azure-ai-translation-document/samples/sample_check_document_statuses.py
[sample_check_document_statuses_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/translation/azure-ai-translation-document/samples/async_samples/sample_check_document_statuses_async.py
[sample_list_all_submitted_jobs]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/translation/azure-ai-translation-document/samples/sample_list_all_submitted_jobs.py
@@ -5,9 +5,14 @@
# ------------------------------------

import json
from typing import Any, TYPE_CHECKING, List, Union
from typing import Any, TYPE_CHECKING, List, Union, overload
from azure.core.tracing.decorator import distributed_trace
from ._generated import BatchDocumentTranslationClient as _BatchDocumentTranslationClient
from ._generated.models import (
BatchRequest as _BatchRequest,
SourceInput as _SourceInput,
TargetInput as _TargetInput,
)
from ._models import (
JobStatusResult,
DocumentStatusResult,
@@ -89,15 +94,32 @@ def close(self):
"""Close the :class:`~azure.ai.translation.document.DocumentTranslationClient` session."""
return self._client.close()

@distributed_trace
@overload
def begin_translation(self, source_url, target_url, target_language_code, **kwargs):
# type: (str, str, str, **Any) -> DocumentTranslationPoller[ItemPaged[DocumentStatusResult]]
pass

@overload
def begin_translation(self, inputs, **kwargs):
# type: (List[DocumentTranslationInput], **Any) -> DocumentTranslationPoller[ItemPaged[DocumentStatusResult]]
"""Begin translating the document(s) in your source container to your TranslationTarget(s)
in the given language.
pass

def begin_translation(self, *args, **kwargs): # pylint: disable=client-method-missing-type-annotations
"""Begin translating the document(s) in your source container to your target container
in the given language. To perform a single translation from source to target, pass the `source_url`,
`target_url`, and `target_language_code` parameters. To pass multiple inputs for translation, including
other translation options, pass the `inputs` parameter as a list of DocumentTranslationInput.

For supported languages and document formats, see the service documentation:
https://docs.microsoft.com/azure/cognitive-services/translator/document-translation/overview

:param str source_url: The source SAS URL to the Azure Blob container containing the documents
to be translated. Requires read and list permissions at the minimum.
:param str target_url: The target SAS URL to the Azure Blob container where the translated documents
should be written. Requires write and list permissions at the minimum.
:param str target_language_code: This is the language you want your documents to be translated to.
See supported language codes here:
https://docs.microsoft.com/azure/cognitive-services/translator/language-support#translate
:param inputs: A list of translation inputs. Each individual input has a single
source URL to documents and can contain multiple TranslationTargets (one for each language)
for the destination to write translated documents.
@@ -118,6 +140,39 @@ def begin_translation(self, inputs, **kwargs):
:caption: Translate the documents in your storage container.
"""

continuation_token = kwargs.pop("continuation_token", None)

try:
    inputs = kwargs.pop('inputs', None)
    if not inputs:
        inputs = args[0]
    inputs = DocumentTranslationInput._to_generated_list(inputs) \
        if not continuation_token else None  # pylint: disable=protected-access
except (AttributeError, TypeError, IndexError):
    try:
        source_url = kwargs.pop('source_url', None)
        if not source_url:
            source_url = args[0]
        target_url = kwargs.pop("target_url", None)
        if not target_url:
            target_url = args[1]
        target_language_code = kwargs.pop("target_language_code", None)
        if not target_language_code:
            target_language_code = args[2]
        inputs = [
            _BatchRequest(
                source=_SourceInput(
                    source_url=source_url
                ),
                targets=[_TargetInput(
                    target_url=target_url,
                    language=target_language_code
                )]
            )
        ]
    except (AttributeError, TypeError, IndexError):
        raise ValueError("Pass either 'inputs' or 'source_url', 'target_url', and 'target_language_code'")
Contributor

small nit: in the message you could say to pass "inputs" for multiple inputs, or these parameters for a single translation
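
One way to picture the two calling forms, together with a message along the lines suggested in this comment, is the standalone sketch below. It is not the implementation in this PR: `resolve_translation_args` is a hypothetical helper, it uses explicit type checks instead of nested `try`/`except`, and plain dicts stand in for the generated `_BatchRequest`, `_SourceInput`, and `_TargetInput` models so it runs without the SDK.

```python
from typing import Any, Dict, List


def resolve_translation_args(*args: Any, **kwargs: Any) -> List[Dict[str, Any]]:
    """Normalize either calling form into a list of plain request dicts.

    Hypothetical helper for illustration only; the dicts are stand-ins for
    the generated request models.
    """
    names = ("source_url", "target_url", "target_language_code")

    # Form 1: a list of inputs, given by keyword or as the only positional argument.
    inputs = kwargs.get("inputs")
    if inputs is None and len(args) == 1 and isinstance(args[0], (list, tuple)):
        inputs = args[0]
    if inputs is not None:
        return list(inputs)

    # Form 2: three strings, each taken from its keyword or from its position.
    values = {}
    for position, name in enumerate(names):
        if name in kwargs:
            values[name] = kwargs[name]
        elif position < len(args):
            values[name] = args[position]

    if len(values) == len(names) and all(isinstance(v, str) for v in values.values()):
        return [{
            "source_url": values["source_url"],
            "targets": [{
                "target_url": values["target_url"],
                "language": values["target_language_code"],
            }],
        }]

    raise ValueError(
        "Pass 'inputs' for multiple translations, or 'source_url', 'target_url', "
        "and 'target_language_code' for a single translation."
    )


print(resolve_translation_args("<sas_url_to_source>", "<sas_url_to_target>", "fr"))
```

Checking types up front keeps the error path explicit: a call such as `resolve_translation_args("<sas_url_to_source>")` fails with the single message above instead of depending on which exception an inner call happens to raise.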


def deserialization_callback(
raw_response, _, headers
): # pylint: disable=unused-argument
@@ -128,7 +183,6 @@ def deserialization_callback(
"polling_interval", self._client._config.polling_interval # pylint: disable=protected-access
)

continuation_token = kwargs.pop("continuation_token", None)
pipeline_response = None
if continuation_token:
pipeline_response = self._client.document_translation.get_translation_status(
@@ -138,8 +192,7 @@ def deserialization_callback(

callback = kwargs.pop("cls", deserialization_callback)
return self._client.document_translation.begin_start_translation(
inputs=DocumentTranslationInput._to_generated_list(inputs) # pylint: disable=protected-access
if not continuation_token else None,
inputs=inputs if not continuation_token else None,
polling=DocumentTranslationLROPollingMethod(
timeout=polling_interval,
lro_algorithms=[