Index out of bounds error during read of an Avro file #12682

Open
JonasDev1 opened this issue Sep 30, 2024 · 1 comment · May be fixed by #12686
Labels: bug (Something isn't working)

Comments

@JonasDev1

Describe the bug

I am hitting an index out of bounds error in the Avro reader. From the stack trace I can see that the null buffer is initialized with the wrong size when the data type is a nullable array of nested structs. The cause is that nullable values are decoded as Union[_, Array] instead of just Array, so array_item_count is calculated incorrectly in datafusion/core/src/datasource/avro_to_arrow/arrow_array_reader.rs:575.

The issue can be solved by resolving the union with the maybe_resolve_union function before counting the array items.
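To illustrate the miscount, here is a minimal sketch. It does not use the real apache_avro or DataFusion types: the `Value` enum is a simplified stand-in, the `maybe_resolve_union` below is a hypothetical re-implementation of the idea (unwrap one union level), and the fallback of counting 1 for a non-array row is an assumption made for the example.

```rust
// Simplified stand-in for a decoded Avro value.
// This is NOT the real `apache_avro::types::Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Long(i64),
    Array(Vec<Value>),
    Record(Vec<(String, Value)>),
    // A nullable field decodes as a union: (branch index, inner value).
    Union(u32, Box<Value>),
}

// Hypothetical re-implementation of the idea behind `maybe_resolve_union`:
// unwrap one level of union, otherwise return the value unchanged.
fn maybe_resolve_union(value: &Value) -> &Value {
    match value {
        Value::Union(_, inner) => inner,
        other => other,
    }
}

// Counts list items without looking through unions. Every union-wrapped
// row falls into the fallback arm and is counted as a single item.
fn item_count_naive(rows: &[Value]) -> usize {
    rows.iter()
        .map(|row| match row {
            Value::Array(items) => items.len(),
            _ => 1, // assumption: non-array rows count as one item
        })
        .sum()
}

// Same count, but each row is resolved first, so wrapped arrays
// contribute their real length.
fn item_count_resolved(rows: &[Value]) -> usize {
    rows.iter()
        .map(|row| match maybe_resolve_union(row) {
            Value::Array(items) => items.len(),
            _ => 1, // assumption: non-array rows count as one item
        })
        .sum()
}

// Builds one `Item` record with an `id` field, as in the schema below.
fn item(id: i64) -> Value {
    Value::Record(vec![("id".to_string(), Value::Long(id))])
}

// Three rows of a nullable array column: 9 items, null, 2 items.
fn sample_rows() -> Vec<Value> {
    vec![
        Value::Union(1, Box::new(Value::Array((0..9).map(item).collect()))),
        Value::Union(0, Box::new(Value::Null)),
        Value::Union(1, Box::new(Value::Array((9..11).map(item).collect()))),
    ]
}
```

For `sample_rows()`, the naive count is 3 (one per row) while the resolved count is 12, so any buffer sized from the naive count is too small for the struct items that are written afterwards.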

To Reproduce

Read an Avro file that contains a column in the following format:

{
      "name": "some_array",
      "type": [
        "null",
        {
          "type": "array",
          "items": {
            "type": "record",
            "name": "Item",
            "fields": [
              {
                "name": "id",
                "type": "long"
              }
            ]
          }
        }
      ]
}

Expected behavior

No response

Additional context

Stacktrace:
index out of bounds: the len is 1 but the index is 1
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:74:14
   2: core::panicking::panic_bounds_check
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:276:5
   3: arrow_buffer::util::bit_util::set_bit
             at /Users/JONSchmi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-buffer-52.2.0/src/util/bit_util.rs:55:5
   4: datafusion::datasource::avro_to_arrow::arrow_array_reader::AvroArrowArrayReader<R>::build_nested_list_array::{{closure}}::{{closure}}
             at /Users/JONSchmi/data-platform/services/kafka-ingest-lambda/datafusion/datafusion/core/src/datasource/avro_to_arrow/arrow_array_reader.rs:598:41
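
The panic message ("the len is 1 but the index is 1") is consistent with a one-byte validity bitmap being indexed at byte 1, which is what happens for any bit index of 8 or above. The sketch below models that with plain `Vec<u8>`; `new_bitmap` and `try_set_bit` are hypothetical helpers written for this illustration, not the arrow_buffer API, and `try_set_bit` reports failure where a direct slice index would panic.

```rust
// Hypothetical stand-in for a validity bitmap: one bit per item,
// rounded up to whole bytes.
fn new_bitmap(item_count: usize) -> Vec<u8> {
    vec![0u8; (item_count + 7) / 8]
}

// Sets bit `i` and returns true, or returns false when the byte holding
// bit `i` lies outside the bitmap. A direct `bitmap[i / 8]` access would
// panic with an index out of bounds error in the second case.
fn try_set_bit(bitmap: &mut [u8], i: usize) -> bool {
    match bitmap.get_mut(i / 8) {
        Some(byte) => {
            *byte |= 1 << (i % 8);
            true
        }
        None => false,
    }
}
```

A bitmap sized for an undercounted item count of 1 has a single byte, so bits 0 to 7 can be set but bit 8 cannot; sized for the real count of 9 items, it has two bytes and bit 8 is in range.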

@JonasDev1 JonasDev1 added the bug Something isn't working label Sep 30, 2024
@JonasDev1

take
