Allow tracking upload progress. #27
@frankyn Please help prioritize this feature. See the discussion in googleapis/google-cloud-python#1830 / googleapis/google-cloud-python#1077 for the tradeoffs involved.
Here's a workaround using tqdm.wrapattr:

```python
import os

from tqdm import tqdm
from google.cloud import storage

def upload_blob(client, bucket_name, source, dest, content_type=None):
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(dest)
    with open(source, "rb") as in_file:
        total_bytes = os.fstat(in_file.fileno()).st_size
        # Wrap the file's read() so tqdm counts every byte the client reads
        with tqdm.wrapattr(in_file, "read", total=total_bytes, miniters=1,
                           desc="upload to %s" % bucket_name) as file_obj:
            blob.upload_from_file(
                file_obj,
                content_type=content_type,
                size=total_bytes,
            )
            return blob

if __name__ == "__main__":
    upload_blob(storage.Client(), "bucket", "/etc/motd", "/path/to/blob.txt", "text/plain")
```
One year since this was opened; any updates?
This is an essential feature for large file uploads/downloads. I resorted to calling gsutil via subprocess just for the download progress bar.
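For downloads, a similar file-object wrapper can work: wrap the destination file's write() instead of the source's read(). A minimal sketch along the lines of the tqdm.wrapattr workaround above (bucket and object names are placeholders):

```python
from tqdm import tqdm
from google.cloud import storage

def download_blob(client, bucket_name, blob_name, dest):
    # get_blob() fetches object metadata, so blob.size is populated
    blob = client.bucket(bucket_name).get_blob(blob_name)
    with open(dest, "wb") as out_file:
        # Wrap the file's write() so tqdm counts every byte written
        with tqdm.wrapattr(out_file, "write", total=blob.size, miniters=1,
                           desc="download from %s" % bucket_name) as file_obj:
            blob.download_to_file(file_obj)

if __name__ == "__main__":
    download_blob(storage.Client(), "bucket", "path/to/blob.txt", "/tmp/blob.txt")
```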
To add to @pdex's submission, the same wrapper works when uploading through a signed URL with requests:

```python
import os
import uuid

import requests
from tqdm import tqdm

def upload_to_signed_url(filename):
    object_address = str(uuid.uuid4())
    # get_upload_url() is our own helper that fetches a signed upload URL
    upload_url, upload_method = get_upload_url(object_address)
    size = os.path.getsize(filename)
    with open(filename, "rb") as in_file:
        with tqdm.wrapattr(
            in_file,
            "read",
            total=size,
            miniters=1,
            desc="Uploading to my bucket",
        ) as file_obj:
            response = requests.request(
                method=upload_method,
                url=upload_url,
                data=file_obj,
                headers={"Content-Type": "application/octet-stream"},
            )
            response.raise_for_status()
    return object_address, size
```
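The get_upload_url helper above is the poster's own and was not shared. For completeness, here is a hedged sketch of what it might look like using a V4 signed URL (the bucket name, expiration, and the implementation itself are illustrative assumptions, not from the original comment):

```python
import datetime

from google.cloud import storage

def get_upload_url(object_address, bucket_name="my-bucket"):
    # Sign a short-lived PUT URL; the content type here must match the
    # Content-Type header sent with the actual upload request.
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(object_address)
    url = blob.generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(minutes=15),
        method="PUT",
        content_type="application/octet-stream",
    )
    return url, "PUT"
```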
@frankyn any updates on this? It's been open since 2019.
Thanks for the ping. @andrewsg, this has +17 upvotes; could you please take a look when you have a moment?
We have some long-term plans around async code and transport mechanisms that may make fully integrated support for a progress meter feasible in the future, but until then, there are two main options: chunk media operations and report status in between chunks, or use a file object wrapper that tracks how much data is written or read.

As it happens, large uploads are already chunked by default using the resumable upload API. However, upload functions in the Python client library are agnostic as to the upload strategy, so we can't easily add callback functionality to upload functions in a way that will work for all uploads - callbacks would only work for resumable uploads, and communicating that to the user would be awkward. At any rate, they would only report completed chunks, so they're inferior to the file object wrapper method.

I'll look into implementing a good first-party turnkey solution for the file object wrapper strategy. Until then, I recommend use of the file object wrapper workarounds shared above.
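For anyone who wants per-chunk progress today, here is a rough sketch of the chunked strategy described above, driving google-resumable-media directly (the chunk size, content type, and names are assumptions, and this bypasses the convenience layer of the storage client):

```python
import google.auth
from google.auth.transport.requests import AuthorizedSession
from google.resumable_media.requests import ResumableUpload

CHUNK_SIZE = 1024 * 1024  # resumable chunks must be multiples of 256 KiB

def upload_with_chunk_progress(bucket_name, blob_name, source):
    credentials, _ = google.auth.default()
    transport = AuthorizedSession(credentials)
    url = (
        "https://storage.googleapis.com/upload/storage/v1/b/"
        "%s/o?uploadType=resumable" % bucket_name
    )
    upload = ResumableUpload(url, CHUNK_SIZE)
    with open(source, "rb") as stream:
        upload.initiate(transport, stream, {"name": blob_name},
                        "application/octet-stream")
        # Progress is only observable between chunks, which is why this is
        # coarser than the file-object wrapper approach.
        while not upload.finished:
            upload.transmit_next_chunk(transport)
            print("%d / %d bytes" % (upload.bytes_uploaded, upload.total_bytes))
```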
+1 to this
This is related to googleapis/google-cloud-python#1830; reopening here as that issue seems to have been closed many years ago.
We would really like this feature, as we need to monitor large files being uploaded to Google Storage buckets. I am surprised more people aren't asking for something this essential, which makes me suspect either that we haven't done our research properly or that the solution is obvious or trivial.
Can someone please share an example of how we could track progress during upload?
Update: Should we be looking at google-resumable-media? We'll try that out and report back.
Thanks