[Schema Registry + Avro Serializer] 1.0.0b1 #13124

Merged
merged 36 commits into from
Sep 4, 2020
56bf5eb
init commit
yunhaoling Aug 14, 2020
d1c96f9
avro serializer structure
yunhaoling Aug 18, 2020
6311e92
adding avro serializer
yunhaoling Aug 20, 2020
0096ef2
tweak api version and fix a typo
yunhaoling Aug 20, 2020
2e95001
test template
yunhaoling Aug 21, 2020
2e8bcc7
avro serializer sync draft
yunhaoling Aug 22, 2020
6248ff1
major azure sr client work done
yunhaoling Aug 25, 2020
f97478e
add sample docstring for sr
yunhaoling Aug 25, 2020
3cf6459
avro serializer async impl
yunhaoling Aug 26, 2020
58fb59f
close the writer
yunhaoling Aug 26, 2020
02c60ec
update avro se/de impl
yunhaoling Aug 27, 2020
85d6766
update avro serializer impl
yunhaoling Aug 27, 2020
5fa4b43
fix apireview reported error in sr
yunhaoling Aug 27, 2020
b910027
srav namespace, setup update
yunhaoling Aug 27, 2020
6b6c8b2
doc update
yunhaoling Aug 28, 2020
c465be1
update doc and api
yunhaoling Aug 30, 2020
63a278c
impl, doc update
yunhaoling Aug 31, 2020
dc363f4
partial update according to laruent's feedback
yunhaoling Sep 1, 2020
740de0e
be consistent with eh extension structure
yunhaoling Sep 1, 2020
7734c42
more update code according to feedback
yunhaoling Sep 1, 2020
92cd385
update credential config
yunhaoling Sep 1, 2020
1c60676
rename package name to azure-schemaregistry-avroserializer
yunhaoling Sep 1, 2020
f20bba0
fix pylint
yunhaoling Sep 1, 2020
c81f16b
try ci fix
yunhaoling Sep 2, 2020
41ee64b
fix test for py27 as avro only accept unicode
yunhaoling Sep 3, 2020
2675331
first round of review feedback
yunhaoling Sep 3, 2020
fb0e6f9
remove temp ci experiment
yunhaoling Sep 3, 2020
0260ea3
init add conftest.py to pass py2.7 test
yunhaoling Sep 3, 2020
bb687cb
laurent feedback update
yunhaoling Sep 3, 2020
d8e0986
remove dictmixin for b1, update comment in sample
yunhaoling Sep 3, 2020
b91fb4f
update api in avroserializer and update test and readme
yunhaoling Sep 4, 2020
929ee68
update test, docs and links
yunhaoling Sep 4, 2020
8fbed90
add share requirement
yunhaoling Sep 4, 2020
01a39a7
update avro dependency
yunhaoling Sep 4, 2020
bde3c24
pr feedback and livetest update
yunhaoling Sep 4, 2020
a2903b6
Merge remote-tracking branch 'central/master' into sr-dev
yunhaoling Sep 4, 2020
11 changes: 10 additions & 1 deletion eng/ignore-links.txt
@@ -1 +1,10 @@
https://docs.microsoft.com/python/api/overview/azure/{{package_doc_id}}
https://docs.microsoft.com/python/api/overview/azure/{{package_doc_id}}
https://pypi.org/project/azure-schemaregistry
https://azuresdkdocs.blob.core.windows.net/$web/python/azure-schemaregistry/latest/index.html
https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/schemaregistry/azure-schemaregistry/samples/sync_samples/schema_registry.py
https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/schemaregistry/azure-schemaregistry/samples/async_samples/schema_registry_async.py
https://pypi.org/project/azure-schemaregistry-avroserializer
https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/schemaregistry/azure-schemaregistry-avroserializer/CHANGELOG.md
https://azuresdkdocs.blob.core.windows.net/$web/python/azure-schemaregistry-avroserializer/latest/index.html
https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/schemaregistry/azure-schemaregistry-avroserializer/samples
https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/schemaregistry/azure-schemaregistry-avroserializer/samples/avro_serializer.py
@@ -0,0 +1,11 @@
# Release History

## 1.0.0b1 (2020-09-08)

Version 1.0.0b1 is the first preview of our efforts to create a user-friendly and Pythonic client library for Azure Schema Registry Avro Serializer.

**New features**

- `SchemaRegistryAvroSerializer` is the top-level client class that encodes and decodes Avro data using the avro library. It automatically registers schemas with, and retrieves schemas from, the Azure Schema Registry service. It provides two methods:
  - `serialize`: Serialize dict data into bytes according to the given schema, registering the schema if needed.
  - `deserialize`: Deserialize bytes into dict data, automatically retrieving the schema from the service.
@@ -0,0 +1,6 @@
include *.md
include azure/__init__.py
include azure/schemaregistry/__init__.py
include azure/schemaregistry/serializer/__init__.py
recursive-include tests *.py
recursive-include samples *.py
174 changes: 174 additions & 0 deletions sdk/schemaregistry/azure-schemaregistry-avroserializer/README.md
@@ -0,0 +1,174 @@
# Azure Schema Registry Avro Serializer client library for Python

Azure Schema Registry Avro Serializer provides the ability to serialize and deserialize data according
to a given Avro schema. It is integrated with the Azure Schema Registry SDK and automatically registers and retrieves schemas.

[Source code][source_code] | [Package (PyPI)][pypi] | [API reference documentation][api_reference] | [Samples][sr_avro_samples] | [Changelog][change_log]

## Getting started

### Install the package

Install the Azure Schema Registry Avro Serializer client library and Azure Identity client library for Python with [pip][pip]:

```Bash
pip install azure-schemaregistry-avroserializer azure-identity
```

### Prerequisites
To use this package, you must have:
* Azure subscription - [Create a free account][azure_sub]
* Azure Schema Registry
* Python 2.7, or 3.5 or later - [Install Python][python]

### Authenticate the client
Interaction with the Schema Registry Avro Serializer starts with an instance of the `SchemaRegistryAvroSerializer` class. To create one, you need a `SchemaRegistryClient` (built from the Schema Registry endpoint and an AAD credential) and the schema group name.

**Create client using the azure-identity library:**
Review comment (Member):

Got feedback from cala; we're not supposed to have samples in these sections, we could make it a formal sample and link to it however, which is what I've leaned towards. Feel free to eyeball my current SB PR for an example of this.


```python
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
endpoint = '<< ENDPOINT OF THE SCHEMA REGISTRY >>'
schema_group = '<< GROUP NAME OF THE SCHEMA >>'
schema_registry_client = SchemaRegistryClient(endpoint, credential)
serializer = SchemaRegistryAvroSerializer(schema_registry_client, schema_group)
```

## Key concepts

- Avro: Apache Avro™ is a data serialization system.
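
To make "data serialization system" concrete, here is a minimal standard-library sketch of how the Avro binary encoding lays out a record: integers are zigzag varints, strings are a length followed by UTF-8 bytes, and a union is a branch index followed by the value. The helper names (`encode_long`, `encode_value`, `encode_record`) are made up for illustration and are not part of this package or of the avro library; the sketch handles only the types used by the `User` schema and only unions of one type with `null`.

```python
import json

SCHEMA = json.loads("""
{"namespace": "example.avro", "type": "record", "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]}
""")


def encode_long(value):
    """Zigzag-encode a signed integer, then emit it as a base-128 varint."""
    zigzag = (value << 1) ^ (value >> 63)
    out = bytearray()
    while zigzag > 0x7F:
        out.append((zigzag & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        zigzag >>= 7
    out.append(zigzag)
    return bytes(out)


def encode_value(avro_type, value):
    """Encode one value; illustrative subset of Avro types only."""
    if isinstance(avro_type, list):  # union: branch index, then the value
        for index, branch in enumerate(avro_type):
            if (branch == "null") == (value is None):
                return encode_long(index) + encode_value(branch, value)
        raise ValueError("no matching union branch")
    if avro_type == "null":
        return b""  # null occupies zero bytes
    if avro_type == "int":
        return encode_long(value)
    if avro_type == "string":
        raw = value.encode("utf-8")
        return encode_long(len(raw)) + raw
    raise NotImplementedError(avro_type)


def encode_record(schema, record):
    """A record is simply its fields, encoded in schema order."""
    return b"".join(
        encode_value(field["type"], record[field["name"]])
        for field in schema["fields"]
    )


encoded = encode_record(
    SCHEMA, {"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
)
print(encoded.hex())  # 0642656e000e0006726564
```

Field names never appear in the output, which is why a reader needs the writer's schema to decode the bytes, and why pairing Avro with a schema registry is useful.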

## Examples

The following sections provide several code snippets covering some of the most common Schema Registry tasks, including:

- [Serialization](#serialization)
- [Deserialization](#deserialization)

### Serialization

```python
import os
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.identity import DefaultAzureCredential

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_group = "<your-group-name>"

schema_registry_client = SchemaRegistryClient(endpoint, token_credential)
serializer = SchemaRegistryAvroSerializer(schema_registry_client, schema_group)

schema_string = """
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}"""

with serializer:
    dict_data = {"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
    encoded_bytes = serializer.serialize(dict_data, schema_string)
```
Review comment (Member, @lmazuel, Sep 2, 2020):

For clarity, I would add another quick code block for EventHub usage, since it's the main planned usage (to connect the dot). Something really simple, with a "Look at EH doc for more details about EH"

I would make it a difference code block to be sure people would not assume this package is EH specific somehow


### Deserialization

```python
import os
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.identity import DefaultAzureCredential

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_group = "<your-group-name>"

schema_registry_client = SchemaRegistryClient(endpoint, token_credential)
serializer = SchemaRegistryAvroSerializer(schema_registry_client, schema_group)

with serializer:
    encoded_bytes = b'<data_encoded_by_azure_schema_registry_avro_serializer>'
    decoded_data = serializer.deserialize(encoded_bytes)
```
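
For intuition about what decoding involves once the schema is known, the following standard-library sketch reads a bare Avro record body back into a dict. It is an illustration only: the helper names are hypothetical, the field list mirrors the `User` schema from the serialization example, and the bytes returned by `serializer.serialize` may carry additional framing that this sketch does not model.

```python
# Field order and types of the example User schema.
FIELDS = [
    ("name", "string"),
    ("favorite_number", ["int", "null"]),
    ("favorite_color", ["string", "null"]),
]


def decode_long(buf, pos):
    """Read a varint starting at pos, undo zigzag, return (value, new_pos)."""
    shift = 0
    raw = 0
    while True:
        byte = buf[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: last byte
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1), pos


def decode_value(avro_type, buf, pos):
    """Decode one value; illustrative subset of Avro types only."""
    if isinstance(avro_type, list):  # union: branch index comes first
        index, pos = decode_long(buf, pos)
        return decode_value(avro_type[index], buf, pos)
    if avro_type == "null":
        return None, pos
    if avro_type == "int":
        return decode_long(buf, pos)
    if avro_type == "string":
        length, pos = decode_long(buf, pos)
        return buf[pos:pos + length].decode("utf-8"), pos + length
    raise NotImplementedError(avro_type)


def decode_record(fields, buf):
    """Walk the fields in schema order, consuming bytes as we go."""
    record, pos = {}, 0
    for name, avro_type in fields:
        record[name], pos = decode_value(avro_type, buf, pos)
    return record


body = bytes.fromhex("0642656e000e0006726564")
decoded = decode_record(FIELDS, body)
print(decoded)
```

Because the byte stream carries no field names or type tags, decoding with a different field order or type list would silently produce wrong values, which is the problem that retrieving the writer's schema from the registry addresses.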

## Troubleshooting

### General

Azure Schema Registry Avro Serializer raises exceptions defined in [Azure Core][azure_core].

### Logging
This library uses the standard
[logging][python_logging] library for logging.
Basic information about HTTP sessions (URLs, headers, etc.) is logged at INFO
level.

Detailed DEBUG level logging, including request/response bodies and unredacted
headers, can be enabled on a client with the `logging_enable` argument:
```python
import sys
import logging
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.identity import DefaultAzureCredential

# Create a logger for the SDK
logger = logging.getLogger('azure.schemaregistry')
logger.setLevel(logging.DEBUG)

# Configure a console output
handler = logging.StreamHandler(stream=sys.stdout)
logger.addHandler(handler)

credential = DefaultAzureCredential()
schema_registry_client = SchemaRegistryClient("<your-end-point>", credential)
# This client will log detailed information about its HTTP sessions, at DEBUG level
serializer = SchemaRegistryAvroSerializer(schema_registry_client, "<your-group-name>", logging_enable=True)
```

Similarly, `logging_enable` can enable detailed logging for a single operation,
even when it isn't enabled for the client:
```py
serializer.serialize(dict_data, schema_string, logging_enable=True)
```
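
The logger configuration above relies on standard `logging` propagation: a handler attached to the `azure.schemaregistry` logger also receives records emitted by loggers beneath it in the dotted hierarchy. A small standard-library sketch of that behaviour follows; the child logger name is hypothetical and chosen only for illustration, and an in-memory stream stands in for the console.

```python
import io
import logging

# Capture output in memory instead of writing to stdout.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

# Configure only the parent logger, as in the README snippet.
parent = logging.getLogger("azure.schemaregistry")
parent.setLevel(logging.DEBUG)
parent.addHandler(handler)

# Hypothetical child logger; it has no level or handler of its own.
child = logging.getLogger("azure.schemaregistry.example_child")
child.debug("request sent")

captured = stream.getvalue()
print(captured)
```

The child inherits the effective DEBUG level from its parent, and its record propagates up to the parent's handler, so one handler covers the whole package namespace.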

## Next steps

### More sample code

Please find further examples in the [samples][sr_avro_samples] directory demonstrating common Azure Schema Registry Avro Serializer scenarios.

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

<!-- LINKS -->
[pip]: https://pypi.org/project/pip/
[pypi]: https://pypi.org/project/azure-schemaregistry-avroserializer
[python]: https://www.python.org/downloads/
[azure_core]: https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/core/azure-core/README.md
[azure_sub]: https://azure.microsoft.com/free/
[python_logging]: https://docs.python.org/3/library/logging.html
[sr_avro_samples]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/schemaregistry/azure-schemaregistry-avroserializer/samples
[api_reference]: https://azuresdkdocs.blob.core.windows.net/$web/python/azure-schemaregistry-avroserializer/latest/index.html
[source_code]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/schemaregistry/azure-schemaregistry-avroserializer
[change_log]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/schemaregistry/azure-schemaregistry-avroserializer/CHANGELOG.md
@@ -0,0 +1,26 @@
# --------------------------------------------------------------------------
#
# Copyright (c) Microsoft Corporation. All rights reserved.
#
# The MIT License (MIT)
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the ""Software""), to
# deal in the Software without restriction, including without limitation the
# rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
# sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
#
# --------------------------------------------------------------------------
__path__ = __import__("pkgutil").extend_path(__path__, __name__) # type: ignore
@@ -0,0 +1,25 @@
# --------------------------------------------------------------------------
#
# Copyright (c) Microsoft Corporation. All rights reserved.
#
# The MIT License (MIT)
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the ""Software""), to
# deal in the Software without restriction, including without limitation the
# rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
# sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
#
# --------------------------------------------------------------------------
@@ -0,0 +1,25 @@
# --------------------------------------------------------------------------
#
# Copyright (c) Microsoft Corporation. All rights reserved.
#
# The MIT License (MIT)
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the ""Software""), to
# deal in the Software without restriction, including without limitation the
# rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
# sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
#
# --------------------------------------------------------------------------
@@ -0,0 +1,34 @@
# --------------------------------------------------------------------------
#
# Copyright (c) Microsoft Corporation. All rights reserved.
#
# The MIT License (MIT)
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the ""Software""), to
# deal in the Software without restriction, including without limitation the
# rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
# sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
#
# --------------------------------------------------------------------------
from ._version import VERSION

__version__ = VERSION

from ._schema_registry_avro_serializer import SchemaRegistryAvroSerializer

__all__ = [
"SchemaRegistryAvroSerializer"
]