fix: do not spam the log with checksum related INFO messages when downloading using transfer_manager (#1357)

* fix: do not spam the log with checksum related INFO messages when downloading using transfer_manager

The `download_chunks_concurrently` function does not allow the `checksum` field to be set in `download_kwargs`, and it does not set the field itself, so the field takes its default value of `"md5"` (see `Blob._prep_and_do_download`). Because ranged downloads do not return checksums, every chunk download logs an INFO message, which adds up to tens or hundreds of messages like this:
```
INFO google.resumable_media._helpers - No MD5 checksum was returned from the service while downloading ...
(which happens for composite objects), so client-side content integrity checking is not being performed.
```
To fix this, set the `checksum` field to `None`, which disables checksum checking for individual chunks. Note that `transfer_manager` has its own checksum checking logic (enabled by the `crc32c_checksum` argument).

* fix tests
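
The kwargs handling described above can be sketched in isolation. This is a minimal illustration, not the library's code: `prepare_chunk_download_kwargs` is a hypothetical helper name, the `None` default handling is an assumption, and the error message is shortened. Only the copy, the `checksum = None` override, and the `command` value come from the diff.

```python
from typing import Any, Dict, Optional


def prepare_chunk_download_kwargs(
    download_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the per-chunk kwargs setup in the diff."""
    # Assumption: a missing dict is treated as empty.
    if download_kwargs is None:
        download_kwargs = {}

    # Callers may not pick a per-chunk checksum; sliced downloads are
    # verified separately through the crc32c_checksum argument.
    if "checksum" in download_kwargs:
        raise ValueError(
            "'checksum' is not supported in download_kwargs; "
            "use the 'crc32c_checksum' argument instead."  # shortened message
        )

    # Copy so the caller's dict is never mutated.
    prepared = download_kwargs.copy()
    # Explicit None replaces the implicit "md5" default, so ranged
    # downloads no longer log a missing-checksum INFO message per chunk.
    prepared["checksum"] = None
    prepared["command"] = "tm.download_sharded"
    return prepared
```

Passing `{"timeout": 30}` yields a new dict with `checksum` set to `None` and `command` set to `"tm.download_sharded"`, while the caller's dict is left unchanged.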
rafalh authored Oct 9, 2024
1 parent 8ec02c0 commit 42392ef
Showing 2 changed files with 3 additions and 6 deletions.
2 changes: 2 additions & 0 deletions google/cloud/storage/transfer_manager.py
@@ -885,6 +885,8 @@ def download_chunks_concurrently(
            "'checksum' is in download_kwargs, but is not supported because sliced downloads have a different checksum mechanism from regular downloads. Use the 'crc32c_checksum' argument on download_chunks_concurrently instead."
        )

    download_kwargs = download_kwargs.copy()
    download_kwargs["checksum"] = None
    download_kwargs["command"] = "tm.download_sharded"

    # We must know the size and the generation of the blob.
7 changes: 1 addition & 6 deletions tests/unit/test_transfer_manager.py
@@ -606,6 +606,7 @@ def test_download_chunks_concurrently():

    expected_download_kwargs = EXPECTED_DOWNLOAD_KWARGS.copy()
    expected_download_kwargs["command"] = "tm.download_sharded"
    expected_download_kwargs["checksum"] = None

    with mock.patch("google.cloud.storage.transfer_manager.open", mock.mock_open()):
        result = transfer_manager.download_chunks_concurrently(
@@ -636,9 +637,6 @@ def test_download_chunks_concurrently_with_crc32c():
    blob_mock.size = len(BLOB_CONTENTS)
    blob_mock.crc32c = "eOVVVw=="

    expected_download_kwargs = EXPECTED_DOWNLOAD_KWARGS.copy()
    expected_download_kwargs["command"] = "tm.download_sharded"

    def write_to_file(f, *args, **kwargs):
        f.write(BLOB_CHUNK)

@@ -664,9 +662,6 @@ def test_download_chunks_concurrently_with_crc32c_failure():
    blob_mock.size = len(BLOB_CONTENTS)
    blob_mock.crc32c = "invalid"

    expected_download_kwargs = EXPECTED_DOWNLOAD_KWARGS.copy()
    expected_download_kwargs["command"] = "tm.download_sharded"

    def write_to_file(f, *args, **kwargs):
        f.write(BLOB_CHUNK)
