Support location for BigQueryInsertJobOperator in deferrable mode #37282

@moiseenkov moiseenkov commented Feb 9, 2024

This PR provides a workaround for issue #35833.

It is necessary because users cannot use BigQueryInsertJobOperator in deferrable mode in regions other than EU, US or us-central1. The fix is provided now because neither gcloud-aio nor google-cloud-bigquery has responded about plans to provide an async get_job method with an optional location parameter.

The proposed solution:

  1. The BigQueryAsyncHook wraps the call of the synchronous method BigQueryAsyncHook.get_job() in asyncio.loop.run_in_executor().
  2. The optional parameter location is passed from the BigQueryInsertJobOperator to the BigQueryInsertJobTrigger, which hands it to the hook.
  3. Because the BigQueryInsertJobTrigger is the parent class of four other triggers, they were updated to comply with the base class attributes:
     - BigQueryCheckTrigger
     - BigQueryGetDataTrigger
     - BigQueryIntervalCheckTrigger
     - BigQueryValueCheckTrigger
  4. Despite the modification, the triggers from (3), along with all the operators that use these triggers and the changed async method BigQueryAsyncHook.get_job_status(), don't change behavior, because the optional parameter location is ignored there. However, it would be good to fix them too, with careful testing, in a separate PR:
     - BigQueryCheckOperator
     - BigQueryGetDataOperator
     - BigQueryIntervalCheckOperator
     - BigQueryValueCheckOperator
     - BigQueryTablePartitionExistenceSensor
     - BigQueryToGCSOperator
     - GCSToBigQueryOperator
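
For readers unfamiliar with the pattern in step 1, here is a minimal sketch of wrapping a blocking call in `run_in_executor()` so that an async caller can await it and still pass `location`. It is not the code from this PR: `SyncJobClient` and `AsyncJobHook` are hypothetical stand-ins for a blocking BigQuery client and the async hook, and the returned dictionary is made up for illustration.

```python
import asyncio
from typing import Optional


class SyncJobClient:
    """Hypothetical stand-in for a blocking BigQuery client."""

    def get_job(self, job_id: str, project_id: str, location: Optional[str] = None) -> dict:
        # A real client would issue a blocking HTTP request here.
        return {
            "job_id": job_id,
            "project_id": project_id,
            "location": location,
            "state": "DONE",
        }


class AsyncJobHook:
    """Hypothetical async hook that delegates to the blocking client."""

    def __init__(self, client: SyncJobClient) -> None:
        self._client = client

    async def get_job(self, job_id: str, project_id: str, location: Optional[str] = None) -> dict:
        loop = asyncio.get_running_loop()
        # run_in_executor() forwards positional arguments only, so the
        # optional location is passed positionally. Passing None as the
        # executor selects the loop's default thread pool.
        return await loop.run_in_executor(
            None, self._client.get_job, job_id, project_id, location
        )


async def main() -> dict:
    hook = AsyncJobHook(SyncJobClient())
    return await hook.get_job("job_123", "my-project", location="europe-west3")


job = asyncio.run(main())
print(job["location"], job["state"])
```

The blocking call runs in a worker thread, so the event loop (in Airflow's case, the triggerer) stays free to serve other coroutines while the request is in flight.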

closes: #35833

@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Feb 9, 2024
@moiseenkov moiseenkov closed this Feb 9, 2024
@moiseenkov moiseenkov changed the title Add optional 'location' parameter to the BigQueryInsertJobTrigger Support location for BigQueryInsertJobOperator in deferrable mode Feb 12, 2024
@moiseenkov moiseenkov reopened this Feb 12, 2024
@moiseenkov moiseenkov force-pushed the bigquery_insert_job_operator_trigger_location branch from 2f81cba to 89a8da6 Compare February 12, 2024 09:47
@VladaZakharova

Hi @eladkal @potiuk!
Could you please review the changes in this PR? Thank you!

@potiuk potiuk merged commit d43c804 into apache:main Feb 12, 2024
55 checks passed
@spencertollefson

A little late here, but gcloud-aio released a version today that includes a location parameter:
https://github.com/talkiq/gcloud-aio/releases/tag/bigquery-7.1.0

Perhaps with this workaround merged, we no longer need it. But we could also build on that parameter now.

Successfully merging this pull request may close these issues.

BigQueryInsertJobOperator doesn't include BigQuery region/location in deferrable mode