
BigQueryDataTransferServiceStartTransferRunsOperator got Access Denied error on long-running migration jobs. #37557

Open
okayhooni opened this issue Feb 20, 2024 · 1 comment
Labels
area:providers good first issue kind:bug This is a clearly a bug provider:google Google (including GCP) related issues

Comments

@okayhooni
Contributor

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

latest

Apache Airflow version

2.3.x ~ latest

Operating System

Debian GNU/Linux 11 (bullseye)

Deployment

Official Apache Airflow Helm Chart

Deployment details

deployed on EKS cluster with customized Airflow Helm chart based on the official chart

What happened

BigQueryDataTransferServiceStartTransferRunsOperator gets an Access Denied error on long-running migration jobs.
In particular, when a BigQuery Data Transfer job triggered by the Airflow operator runs for more than one hour, it fails due to expiration of the credential (default lifetime = 1 hour).
However, I have sometimes seen the same Access Denied error even on jobs that had been running for less than 10 minutes.

Access Denied: Table projectid:table_name: Permission bigquery.tables.get denied on table projectid:datasetid.table_name (or it may not exist).

(those DTS job error logs can also be seen on BigQuery Data transfer console)

At first I thought it was a token expiration issue, so I attempted to refresh the token; however, it had no effect on the problem.
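
To illustrate why refreshing the token on the Airflow side would not help: the transfer job runs inside BigQuery with whatever credential it captured at submission, so a token minted at job start can expire mid-job once the default one-hour lifetime elapses. A minimal timing sketch (the constant and helper below are illustrative, not Airflow or Google client code):

```python
from datetime import datetime, timedelta, timezone

# Illustrative only: the default access-token lifetime on GCP is one hour.
DEFAULT_TOKEN_LIFETIME = timedelta(hours=1)

def token_expired_mid_job(job_started: datetime, job_finished: datetime,
                          lifetime: timedelta = DEFAULT_TOKEN_LIFETIME) -> bool:
    """Return True if a token minted at job start expires before the job ends."""
    return job_finished - job_started >= lifetime

start = datetime(2024, 2, 20, tzinfo=timezone.utc)
# A 90-minute migration outlives the credential; a 10-minute one should not,
# which is why the short-job failures point to something beyond token expiry.
print(token_expired_mid_job(start, start + timedelta(minutes=90)))  # True
print(token_expired_mid_job(start, start + timedelta(minutes=10)))  # False
```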

What you think should happen instead

  • A BigQueryDataTransferServiceStartTransferRunsOperator task should succeed regardless of how long the migration job runs.
  • As mentioned above, I have sometimes seen the same Access Denied error even on jobs that had been running for less than 10 minutes, so I suspect the issue lies on the GCP side itself.

How to reproduce

  • Run a BigQuery DTS job that migrates a sufficiently large data source, so that it takes more than an hour.
  • It can sometimes also be reproduced on shorter DTS jobs (< 10 minutes).
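
For reference, a comparably long-running transfer run can also be started manually with the bq CLI (the project and config IDs below are placeholders; the flags follow the BigQuery DTS manual-run/backfill documentation):

```shell
# PROJECT_ID and CONFIG_ID are hypothetical placeholders.
# Schedule a manual (backfill) transfer run over a large time window,
# so the resulting job runs for well over an hour:
bq mk --transfer_run \
  --start_time="2024-02-01T00:00:00Z" \
  --end_time="2024-02-19T00:00:00Z" \
  projects/PROJECT_ID/locations/us/transferConfigs/CONFIG_ID
```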

Anything else

Related issue:

Related MR:

Related Google BigQuery Docs:

Error: Access Denied: ... Permission bigquery.tables.get denied on table ...
Resolution: Confirm that the BigQuery Data Transfer Service service agent is granted the bigquery.dataEditor role on the target dataset. This grant is automatically applied when creating and updating the transfer, but it's possible that the access policy was modified manually afterwards. To regrant the permission, see Grant access to a dataset.
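
The resolution quoted above can be applied from the bq CLI (the project and dataset names are placeholders; the service agent address follows Google's documented `gcp-sa-bigquerydatatransfer` format):

```shell
# PROJECT_ID, PROJECT_NUMBER, and DATASET_ID are hypothetical placeholders.
# Dump the dataset's current access policy to a file:
bq show --format=prettyjson PROJECT_ID:DATASET_ID > dataset.json
# Manually add the DTS service agent to the "access" array in dataset.json:
#   {"role": "roles/bigquery.dataEditor",
#    "userByEmail": "service-PROJECT_NUMBER@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com"}
# Then apply the updated policy back to the dataset:
bq update --source dataset.json PROJECT_ID:DATASET_ID
```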

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@okayhooni okayhooni added area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Feb 20, 2024

boring-cyborg bot commented Feb 20, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@potiuk potiuk added good first issue and removed needs-triage label for new issues that we didn't triage yet labels Feb 24, 2024
@eladkal eladkal added the provider:google Google (including GCP) related issues label Feb 24, 2024