iterate through blobs before checking prefixes #36202
LGTM
This PR may fix the error in the Google provider RC. Could you kindly include it in the next RC for the Google provider? cc @potiuk
Please add tests
This test (https://github.com/apache/airflow/pull/36202/files#diff-76fef248bc8b2046dae5e6e1fe60dcb916e4114d85190c563454d2d123937b43R829) mainly checks that the prefixes are actually used if they exist (https://github.com/apache/airflow/pull/36202/files#diff-82854006b5553665046db26d43a9dfa90bec78d4ba93e2d2ca7ff5bf632fa624R944-R945).
According to https://github.com/googleapis/python-storage/blob/v2.14.0/google/cloud/storage/client.py#L1213-L1217, the prefixes are not returned until the blobs are consumed
Just in time for RC3
Looks great! Thanks :)
After reading https://cloud.google.com/storage/docs/json_api/v1/objects/list, https://github.com/googleapis/python-storage/blob/v2.14.0/google/cloud/storage/client.py#L1143-L1145C35 and https://github.com/apache/airflow/blob/providers-google/10.13.0rc2/airflow/providers/google/cloud/hooks/gcs.py#L732, I suspect we might not be using `delimiter` the right way. But it is going to be deprecated, so it might be better for us to keep the original behavior.
According to https://github.com/googleapis/python-storage/blob/v2.14.0/google/cloud/storage/client.py#L1213-L1217, the prefixes are not returned until the `blobs` are consumed. Thus, to keep the behavior of checking whether `prefixes` exists and deciding which content to extend the result with, we need to consume `blobs` first.
Related: #36130
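The ordering issue described above can be illustrated with a small, self-contained sketch. It does not use the real google-cloud-storage classes: `FakeBlob` and `FakeBlobIterator` are hypothetical stand-ins that only mimic the one property under discussion (the `prefixes` attribute stays empty until iteration starts), and `list_ids_before` / `list_ids_after` are illustrative names, not the hook's actual functions.

```python
from typing import Iterator, List, Set


class FakeBlob:
    """Hypothetical stand-in for a storage blob; only the name matters here."""

    def __init__(self, name: str) -> None:
        self.name = name


class FakeBlobIterator:
    """Hypothetical stand-in for a lazy listing iterator.

    Assumption for illustration: `prefixes` is filled in only once the
    listing is actually iterated, as described in this PR.
    """

    def __init__(self, names: List[str], prefixes: List[str]) -> None:
        self._names = names
        self._pending_prefixes = prefixes
        self.prefixes: Set[str] = set()

    def __iter__(self) -> Iterator[FakeBlob]:
        # Prefixes become visible only when the "page" is fetched.
        self.prefixes.update(self._pending_prefixes)
        for name in self._names:
            yield FakeBlob(name)


def list_ids_before(blobs: FakeBlobIterator) -> List[str]:
    """Checks prefixes before iterating, so the check never succeeds."""
    ids: List[str] = []
    if blobs.prefixes:  # still empty: nothing has been consumed yet
        ids.extend(sorted(blobs.prefixes))
    else:
        ids.extend(blob.name for blob in blobs)
    return ids


def list_ids_after(blobs: FakeBlobIterator) -> List[str]:
    """Consumes the blobs first, then decides what to extend the result with."""
    ids: List[str] = []
    blob_names = [blob.name for blob in blobs]  # consuming populates prefixes
    if blobs.prefixes:
        ids.extend(sorted(blobs.prefixes))
    else:
        ids.extend(blob_names)
    return ids


if __name__ == "__main__":
    listing = (["dir/"], ["dir/sub1/", "dir/sub2/"])
    print(list_ids_before(FakeBlobIterator(*listing)))
    print(list_ids_after(FakeBlobIterator(*listing)))
```

With the same listing, the first function falls through to the blob names because `prefixes` is still empty at the time of the check, while the second returns the prefixes, which is the original behavior this PR aims to preserve.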