ValueError: Must pass schema, or at least one RecordBatch in BigQuery #909

Closed
grieve54706 opened this issue Nov 14, 2024 · 0 comments · Fixed by #1013
Labels: bigquery, bug (Something isn't working), known issues

Comments

@grieve54706
Contributor

We hit this error when querying a table that has an INTERVAL or JSON column while the table is empty.

if record_batches and bqstorage_client is not None:
    return pyarrow.Table.from_batches(record_batches)
else:
    # No records (not record_batches), use schema based on BigQuery schema
    # **or**
    # we used the REST API (bqstorage_client is None),
    # which doesn't add arrow extension metadata, so we let
    # `bq_to_arrow_schema` do it.
    arrow_schema = _pandas_helpers.bq_to_arrow_schema(self._schema)
    return pyarrow.Table.from_batches(record_batches, schema=arrow_schema)

https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/table.py#L1965-L1974

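When record_batches is empty, the BigQuery client takes the else branch and tries to build arrow_schema from the BigQuery schema. However, the INTERVAL and JSON types are not in the scalar type mapping, so no Arrow schema can be produced for a table with such a column. Table.from_batches is then called with neither batches nor a schema, which raises the ValueError in the title.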

_BQ_TO_ARROW_SCALARS = {
    "BOOL": pyarrow.bool_,
    "BOOLEAN": pyarrow.bool_,
    "BYTES": pyarrow.binary,
    "DATE": pyarrow.date32,
    "DATETIME": pyarrow_datetime,
    "FLOAT": pyarrow.float64,
    "FLOAT64": pyarrow.float64,
    "GEOGRAPHY": pyarrow.string,
    "INT64": pyarrow.int64,
    "INTEGER": pyarrow.int64,
    "NUMERIC": pyarrow_numeric,
    "STRING": pyarrow.string,
    "TIME": pyarrow_time,
    "TIMESTAMP": pyarrow_timestamp,
}

https://github.com/googleapis/python-bigquery/blob/106161180ead01aca1ead909cf06ca559f68666d/google/cloud/bigquery/_pyarrow_helpers.py#L56-L71
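To make the failure path easier to follow, here is a minimal pure-Python sketch of it. It does not import pyarrow or google-cloud-bigquery: the type names are plain strings, the mapping is shortened, and the functions are simplified stand-ins that only borrow the names from the snippets above. It assumes that the schema helper gives up and returns None as soon as it meets an unmapped type, which is our reading of the behavior rather than something verified line by line.

```python
# Simplified stand-ins: type names are plain strings, not real pyarrow types.
BQ_TO_ARROW_SCALARS = {
    "BOOL": "bool",
    "INT64": "int64",
    "STRING": "string",
    "TIMESTAMP": "timestamp",
}


def bq_to_arrow_schema(bq_schema):
    """Return a list of (name, arrow_type) pairs, or None if any type is unmapped."""
    fields = []
    for name, bq_type in bq_schema:
        arrow_type = BQ_TO_ARROW_SCALARS.get(bq_type.upper())
        if arrow_type is None:
            # Assumption: an unknown type means no schema is built at all.
            return None
        fields.append((name, arrow_type))
    return fields


def table_from_batches(record_batches, schema=None):
    """Mimic the check that produces the error message in the issue title."""
    if schema is None and not record_batches:
        raise ValueError("Must pass schema, or at least one RecordBatch")
    return {"schema": schema, "batches": list(record_batches)}


def to_arrow(record_batches, bq_schema):
    """Model of the else branch: always derive the schema from the BigQuery schema."""
    return table_from_batches(record_batches, schema=bq_to_arrow_schema(bq_schema))


# An empty table with only mapped types still gets a schema.
print(to_arrow([], [("id", "INT64")]))

# An empty table with a JSON column ends up with no batches and no schema.
try:
    to_arrow([], [("id", "INT64"), ("payload", "JSON")])
except ValueError as exc:
    print("ValueError:", exc)
```

In this model, the same JSON column would not fail if at least one record batch were present, which matches the error only showing up on empty tables.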

The python-bigquery maintainers do not seem to plan on handling these types. See the following pending or closed issues and PRs:
googleapis/python-bigquery#1580
googleapis/python-bigquery#1832
googleapis/python-bigquery#826
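Since upstream does not cover these types, one possible direction on our side is to build the schema for empty results ourselves and fall back to a string type for anything unmapped. The sketch below only illustrates that idea and is not necessarily what the linked fix does. All names in it (`FALLBACK_ARROW_TYPE`, `bq_to_arrow_schema_with_fallback`) are hypothetical, the mapping is shortened, and type names are plain strings standing in for real Arrow types.

```python
# Hypothetical names throughout; strings stand in for real Arrow types.
BQ_TO_ARROW_SCALARS = {
    "BOOL": "bool",
    "INT64": "int64",
    "STRING": "string",
    "TIMESTAMP": "timestamp",
}

# Assumption: unmapped types such as JSON and INTERVAL are exposed as strings.
FALLBACK_ARROW_TYPE = "string"


def bq_to_arrow_schema_with_fallback(bq_schema):
    """Build (name, arrow_type) pairs, using the fallback for unmapped types."""
    fields = []
    for name, bq_type in bq_schema:
        arrow_type = BQ_TO_ARROW_SCALARS.get(bq_type.upper(), FALLBACK_ARROW_TYPE)
        fields.append((name, arrow_type))
    return fields


schema = bq_to_arrow_schema_with_fallback(
    [("id", "INT64"), ("payload", "JSON"), ("ttl", "INTERVAL")]
)
print(schema)
```

With a schema always available, the empty-table case no longer depends on the upstream mapping. The sketch ignores nested and repeated fields, which would need their own handling.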
