[BUG] Propagate S3Config.num_tries to pyarrow S3 filesystem (#2800)
Addresses: #2788

Propagates the S3Config.num_tries config to the pyarrow S3 filesystem.

Note that the other relevant parameters on S3Config, `retry_mode` and
`retry_initial_backoff_ms`, are ignored as pyarrow's
[S3RetryStrategy](https://github.com/apache/arrow/blob/ab0a40ee34217070f14027776682074c55d0b507/python/pyarrow/_s3fs.pyx#L112)
only has one parameter `max_attempts`.

Note that this only addresses S3. GCSConfig and AzureConfig do not have
retry settings.
jmurray-clarify authored Sep 6, 2024
1 parent 91d9fe9 commit e3fbf88
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions daft/filesystem.py
@@ -221,6 +221,13 @@ def _set_if_not_none(kwargs: dict[str, Any], key: str, val: Any | None):
             _set_if_not_none(translated_kwargs, "session_token", s3_config.session_token)
             _set_if_not_none(translated_kwargs, "region", s3_config.region_name)
             _set_if_not_none(translated_kwargs, "anonymous", s3_config.anonymous)
+            if s3_config.num_tries is not None:
+                try:
+                    from pyarrow.fs import AwsStandardS3RetryStrategy
+
+                    translated_kwargs["retry_strategy"] = AwsStandardS3RetryStrategy(max_attempts=s3_config.num_tries)
+                except ImportError:
+                    pass  # Config does not exist in pyarrow 7.0.0

         resolved_filesystem = S3FileSystem(**translated_kwargs)
         resolved_path = resolved_filesystem.normalize_path(_unwrap_protocol(path))
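
For readers who want to try the translation step outside of `daft/filesystem.py`, here is a minimal standalone sketch of the same pattern. `S3ConfigSketch` and `build_retry_kwargs` are hypothetical names introduced only for illustration; they are not part of Daft. The sketch assumes, as the commit does, that `AwsStandardS3RetryStrategy` may be missing from older pyarrow builds, and it falls back to pyarrow's default retry behaviour in that case.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class S3ConfigSketch:
    """Hypothetical stand-in for S3Config, holding only the retry fields."""

    num_tries: Optional[int] = None
    retry_mode: Optional[str] = None  # not forwarded: no pyarrow equivalent
    retry_initial_backoff_ms: Optional[int] = None  # not forwarded either


def build_retry_kwargs(cfg: S3ConfigSketch) -> Dict[str, Any]:
    """Translate the retry settings into keyword arguments for S3FileSystem."""
    kwargs: Dict[str, Any] = {}
    if cfg.num_tries is None:
        return kwargs
    try:
        # Imported lazily so that environments without this class
        # (or without pyarrow at all) still work.
        from pyarrow.fs import AwsStandardS3RetryStrategy
    except ImportError:
        return kwargs  # keep pyarrow's default retry behaviour
    kwargs["retry_strategy"] = AwsStandardS3RetryStrategy(max_attempts=cfg.num_tries)
    return kwargs


retry_kwargs = build_retry_kwargs(S3ConfigSketch(num_tries=5, retry_mode="adaptive"))
print(sorted(retry_kwargs))
```

The returned dictionary is meant to be merged into the keyword arguments passed to `S3FileSystem`. Only `num_tries` has an effect here; `retry_mode` and `retry_initial_backoff_ms` are accepted but dropped, mirroring the limitation described in the commit message.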