Avro import does not support 'enum' type #311

Closed
arcogabbo opened this issue Jul 3, 2024 · 5 comments

@arcogabbo

datacontract-cli version: 0.10.8
command used: docker run --rm -v ${PWD}:/home/datacontract datacontract/cli import --format avro --source example.json

Currently I'm having issues importing an Avro schema like the following:

{
  "name":"example",
  "type":"record",
  "fields":[
    {
      "name":"ExampleField",
      "type":{
        "type":"enum",
        "name":"ExampleEnum",
        "symbols":[
          "FOO",
          "BAR"
        ]
      },
      "default":"FOO"
    }
  ]
}

I'm receiving the following error:
DataContractException: Run operation failed: [schema] Map avro type to data contract type - None - failed - Unsupported type enum in avro schema. - datacontract

StackTrace
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /opt/venv/lib/python3.11/site-packages/datacontract/cli.py:236 in import_    │
│                                                                              │
│   233 │   """                                                                │
│   234 │   Create a data contract from the given source location. Prints to s │
│   235 │   """                                                                │
│ ❱ 236 │   result = DataContract().import_from_source(format, source, glue_ta │
│   237 │   console.print(result.to_yaml())                                    │
│   238                                                                        │
│   239                                                                        │
│                                                                              │
│ ╭────────────────────── locals ───────────────────────╮                      │
│ │      bigquery_dataset = None                        │                      │
│ │      bigquery_project = None                        │                      │
│ │        bigquery_table = None                        │                      │
│ │                format = <ImportFormat.avro: 'avro'> │                      │
│ │            glue_table = None                        │                      │
│ │                source = 'service-preferences.json'  │                      │
│ │ unity_table_full_name = None                        │                      │
│ ╰─────────────────────────────────────────────────────╯                      │
│                                                                              │
│ /opt/venv/lib/python3.11/site-packages/datacontract/data_contract.py:342 in  │
│ import_from_source                                                           │
│                                                                              │
│   339 │   │   if format == "sql":                                            │
│   340 │   │   │   data_contract_specification = import_sql(data_contract_spe │
│   341 │   │   elif format == "avro":                                         │
│ ❱ 342 │   │   │   data_contract_specification = import_avro(data_contract_sp │
│   343 │   │   elif format == "glue":                                         │
│   344 │   │   │   data_contract_specification = import_glue(data_contract_sp │
│   345 │   │   elif format == "jsonschema":                                   │
│                                                                              │
│ ╭───────────────────────────────── locals ─────────────────────────────────╮ │
│ │            bigquery_dataset = None                                       │ │
│ │            bigquery_project = None                                       │ │
│ │             bigquery_tables = None                                       │ │
│ │ data_contract_specification = DataContractSpecification(                 │ │
│ │                               │   dataContractSpecification='0.9.3',     │ │
│ │                               │   id='my-data-contract-id',              │ │
│ │                               │   info=Info(                             │ │
│ │                               │   │   title='My Data Contract',          │ │
│ │                               │   │   version='0.0.1',                   │ │
│ │                               │   │   status=None,                       │ │
│ │                               │   │   description=None,                  │ │
│ │                               │   │   owner=None,                        │ │
│ │                               │   │   contact=None                       │ │
│ │                               │   ),                                     │ │
│ │                               │   servers={},                            │ │
│ │                               │   terms=None,                            │ │
│ │                               │   models={},                             │ │
│ │                               │   definitions={},                        │ │
│ │                               │   examples=[],                           │ │
│ │                               │   quality=None,                          │ │
│ │                               │   servicelevels=None                     │ │
│ │                               )                                          │ │
│ │                      format = <ImportFormat.avro: 'avro'>                │ │
│ │                 glue_tables = None                                       │ │
│ │                        self = <datacontract.data_contract.DataContract   │ │
│ │                               object at 0xffff84835e90>                  │ │
│ │                      source = 'service-preferences.json'                 │ │
│ │       unity_table_full_name = None                                       │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
│                                                                              │
│ /opt/venv/lib/python3.11/site-packages/datacontract/imports/avro_importer.py │
│ :25 in import_avro                                                           │
│                                                                              │
│    22 │                                                                      │
│    23 │   # type record is being used for both the table and the object type │
│    24 │   # -> CONSTRAINT: one table per .avsc input, all nested records are │
│ ❱  25 │   fields = import_record_fields(avro_schema.fields)                  │
│    26 │                                                                      │
│    27 │   data_contract_specification.models[avro_schema.name] = Model(      │
│    28 │   │   fields=fields,                                                 │
│                                                                              │
│ ╭───────────────────────────────── locals ─────────────────────────────────╮ │
│ │                 avro_schema = <avro.schema.RecordSchema object at        │ │
│ │                               0xffff7a510fd0>                            │ │
│ │ data_contract_specification = DataContractSpecification(                 │ │
│ │                               │   dataContractSpecification='0.9.3',     │ │
│ │                               │   id='my-data-contract-id',              │ │
│ │                               │   info=Info(                             │ │
│ │                               │   │   title='My Data Contract',          │ │
│ │                               │   │   version='0.0.1',                   │ │
│ │                               │   │   status=None,                       │ │
│ │                               │   │   description=None,                  │ │
│ │                               │   │   owner=None,                        │ │
│ │                               │   │   contact=None                       │ │
│ │                               │   ),                                     │ │
│ │                               │   servers={},                            │ │
│ │                               │   terms=None,                            │ │
│ │                               │   models={},                             │ │
│ │                               │   definitions={},                        │ │
│ │                               │   examples=[],                           │ │
│ │                               │   quality=None,                          │ │
│ │                               │   servicelevels=None                     │ │
│ │                               )                                          │ │
│ │                        file = <_io.TextIOWrapper                         │ │
│ │                               name='service-preferences.json' mode='r'   │ │
│ │                               encoding='utf-8'>                          │ │
│ │                      source = 'service-preferences.json'                 │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
│                                                                              │
│ /opt/venv/lib/python3.11/site-packages/datacontract/imports/avro_importer.py │
│ :79 in import_record_fields                                                  │
│                                                                              │
│    76 │   │   │   imported_field.type = "array"                              │
│    77 │   │   │   imported_field.items = import_avro_array_items(field.type) │
│    78 │   │   else:  # primitive type                                        │
│ ❱  79 │   │   │   imported_field.type = map_type_from_avro(field.type.type)  │
│    80 │   │                                                                  │
│    81 │   │   imported_fields[field.name] = imported_field                   │
│    82                                                                        │
│                                                                              │
│ ╭───────────────────────────── locals ─────────────────────────────╮         │
│ │           field = <avro.schema.Field object at 0xffff8483c650>   │         │
│ │  imported_field = Field(                                         │         │
│ │                   │   ref=None,                                  │         │
│ │                   │   ref_obj=None,                              │         │
│ │                   │   title=None,                                │         │
│ │                   │   type=None,                                 │         │
│ │                   │   format=None,                               │         │
│ │                   │   required=True,                             │         │
│ │                   │   primary=None,                              │         │
│ │                   │   unique=None,                               │         │
│ │                   │   references=None,                           │         │
│ │                   │   description=None,                          │         │
│ │                   │   pii=None,                                  │         │
│ │                   │   classification=None,                       │         │
│ │                   │   pattern=None,                              │         │
│ │                   │   minLength=None,                            │         │
│ │                   │   maxLength=None,                            │         │
│ │                   │   minimum=None,                              │         │
│ │                   │   exclusiveMinimum=None,                     │         │
│ │                   │   maximum=None,                              │         │
│ │                   │   exclusiveMaximum=None,                     │         │
│ │                   │   enum=[],                                   │         │
│ │                   │   tags=[],                                   │         │
│ │                   │   fields={},                                 │         │
│ │                   │   items=None,                                │         │
│ │                   │   precision=None,                            │         │
│ │                   │   scale=None,                                │         │
│ │                   │   example=None,                              │         │
│ │                   │   config={'avroDefault': 'FOO'}              │         │
│ │                   )                                              │         │
│ │ imported_fields = {}                                             │         │
│ │   record_fields = [<avro.schema.Field object at 0xffff8483c650>] │         │
│ ╰──────────────────────────────────────────────────────────────────╯         │
│                                                                              │
│ /opt/venv/lib/python3.11/site-packages/datacontract/imports/avro_importer.py │
│ :151 in map_type_from_avro                                                   │
│                                                                              │
│   148 │   elif avro_type_str == "array":                                     │
│   149 │   │   return "array"                                                 │
│   150 │   else:                                                              │
│ ❱ 151 │   │   raise DataContractException(                                   │
│   152 │   │   │   type="schema",                                             │
│   153 │   │   │   result="failed",                                           │
│   154 │   │   │   name="Map avro type to data contract type",                │
│                                                                              │
│ ╭──────── locals ────────╮                                                   │
│ │ avro_type_str = 'enum' │                                                   │
│ ╰────────────────────────╯                                                   │
╰──────────────────────────────────────────────────────────────────────────────╯

@jochenchrist
Contributor

Confirming that this is currently not implemented, but it should be.

If you have the ability and time to create a pull request, we're always happy to accept contributions.
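
For anyone picking this up, here is a minimal sketch of one way an Avro enum field could be mapped, assuming the enum becomes a `string` field whose `enum` list holds the Avro symbols. It works on the parsed JSON only and does not use the `avro` or `datacontract` packages. The function name `enum_field_to_contract` and the choice of `string` as the target type are assumptions for illustration, not necessarily what the importer does; the keys `title`, `enum`, `required` and `config["avroDefault"]` mirror the `Field` attributes visible in the traceback above.

```python
import json

# The schema from the issue report, unchanged.
AVRO_SCHEMA = """
{
  "name": "example",
  "type": "record",
  "fields": [
    {
      "name": "ExampleField",
      "type": {"type": "enum", "name": "ExampleEnum", "symbols": ["FOO", "BAR"]},
      "default": "FOO"
    }
  ]
}
"""


def enum_field_to_contract(avro_field: dict) -> dict:
    """Hypothetical mapping: Avro enum field -> string field with allowed values."""
    avro_type = avro_field["type"]
    if not (isinstance(avro_type, dict) and avro_type.get("type") == "enum"):
        raise ValueError(f"Field {avro_field.get('name')!r} is not an Avro enum")

    contract_field = {
        "type": "string",                    # assumption: enums are represented as strings
        "title": avro_type["name"],          # keep the Avro enum name for reference
        "enum": list(avro_type["symbols"]),  # allowed values, in declaration order
        "required": True,                    # a non-union Avro field is always present
    }
    if "default" in avro_field:
        # Same key that shows up in the traceback locals: config={'avroDefault': 'FOO'}
        contract_field["config"] = {"avroDefault": avro_field["default"]}
    return contract_field


schema = json.loads(AVRO_SCHEMA)
mapped = enum_field_to_contract(schema["fields"][0])
print(json.dumps(mapped, indent=2))
```

In the real importer this logic would presumably sit next to the existing `array` branch in `import_record_fields`, before the fallback to `map_type_from_avro`.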

@arcogabbo
Author

Can you also verify whether the map type is supported? I'm facing the same problem with the map type.
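
To check which types a schema actually uses before running the import, a small helper like the one below may be useful. It is only a sketch: `collect_avro_types` is a hypothetical name, it walks the schema as plain JSON rather than through the `avro` library, and it does not know which types a given datacontract-cli version supports. The second field (`Attributes`) is added here purely to illustrate a map.

```python
import json


def collect_avro_types(schema, found=None):
    """Recursively collect every Avro type name that appears in a schema."""
    if found is None:
        found = set()
    if isinstance(schema, str):
        # Primitive type name, or a reference to a previously named type.
        found.add(schema)
    elif isinstance(schema, list):
        # A JSON array in type position is an Avro union.
        found.add("union")
        for branch in schema:
            collect_avro_types(branch, found)
    elif isinstance(schema, dict):
        type_ = schema.get("type")
        if isinstance(type_, str):
            found.add(type_)
            if type_ == "record":
                for field in schema.get("fields", []):
                    collect_avro_types(field["type"], found)
            elif type_ == "array":
                collect_avro_types(schema["items"], found)
            elif type_ == "map":
                collect_avro_types(schema["values"], found)
        else:
            # Nested type definition, e.g. {"type": {"type": "enum", ...}}
            collect_avro_types(type_, found)
    return found


example = json.loads("""
{
  "name": "example",
  "type": "record",
  "fields": [
    {"name": "ExampleField",
     "type": {"type": "enum", "name": "ExampleEnum", "symbols": ["FOO", "BAR"]}},
    {"name": "Attributes",
     "type": {"type": "map", "values": "string"}}
  ]
}
""")

used_types = sorted(collect_avro_types(example))
print(used_types)  # ['enum', 'map', 'record', 'string']
```

Any entry in the printed list that the importer has no branch for would be expected to hit the same "Unsupported type" error shown in the traceback.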

@aniketkapdule
Contributor

aniketkapdule commented Jul 11, 2024

I will put up a fix in a day or two, but if you are already working on it, feel free to raise a PR :)

@jochenchrist
Contributor

This should now be fixed in v0.10.10.
