[BUG] BlobClient.download fails with "Connection reset by peer" for large files #21066

jgmpzman · 2021-04-29T21:20:16Z

Describe the bug
We use the Azure Java SDK in our product code, and one feature we need to support is downloading files from Azure. This works consistently for files under 10 GB. Downloads of block blobs larger than 10 GB, especially around 90 GB, fail much more often than they succeed. The API has no problem uploading the 90 GB file; only the download fails. Is there a better API method that should be used for larger files? Any help would be greatly appreciated.

Exception or Stack Trace
2021-04-29 20:32:49 246 [main] ERROR com.jgm.AzureFileDownload - Error while trying to download the file=/home/kompuser/90gb.txt
reactor.core.Exceptions$ReactiveException: io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
at reactor.core.Exceptions.propagate(Exceptions.java:393)
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:97)
at reactor.core.publisher.Mono.block(Mono.java:1680)
at com.azure.storage.common.implementation.StorageImplUtils.blockWithOptionalTimeout(StorageImplUtils.java:99)
at com.azure.storage.blob.specialized.BlobClientBase.downloadWithResponse(BlobClientBase.java:562)
at com.azure.storage.blob.specialized.BlobClientBase.download(BlobClientBase.java:522)
at com.jgm.AzureFileDownload.getObject(AzureFileDownload.java:41)
at com.jgm.AzureFileDownload.main(AzureFileDownload.java:61)
Suppressed: java.lang.Exception: #block terminated with an error
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
... 6 more
Caused by: io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer

To Reproduce

1. Upload a 90 GB block blob to Azure (this took about 75 minutes).
2. Call BlobClient.download with a FileOutputStream to download the file.

Code Snippet
A shorter variation of the code is here:

    import com.azure.storage.blob.BlobClient;
    import com.azure.storage.blob.BlobContainerClient;
    import com.azure.storage.blob.BlobServiceClient;
    import com.azure.storage.blob.BlobServiceClientBuilder;
    import com.azure.storage.common.StorageSharedKeyCredential;
    import com.azure.storage.common.Utility;
    // Assumption: the logger is SLF4J, based on the {} placeholders in the log calls.
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import java.util.Locale;

    public class AzureFileDownload {

        private static final Logger log = LoggerFactory.getLogger(AzureFileDownload.class);

        private final BlobContainerClient container;

        public AzureFileDownload(String bucketName, String accountName, String accountKey) {
            String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
            this.container = createBlobServiceClient(new StorageSharedKeyCredential(accountName, accountKey), endpoint)
                    .getBlobContainerClient(bucketName);
        }

        private BlobClient createBlobClient(String key) {
            String encodedKey = Utility.urlEncode(key);     // Encode + and %
            return container.getBlobClient(encodedKey);
        }

        private BlobServiceClient createBlobServiceClient(StorageSharedKeyCredential credential, String endpoint) {
            return new BlobServiceClientBuilder().credential(credential).endpoint(endpoint).buildClient();
        }

        // Streams the whole blob into the file with a single download call.
        public void getObject(String key, File file) {
            try (OutputStream outputStream = new FileOutputStream(file)) {
                BlobClient blob = createBlobClient(key);
                blob.download(outputStream);
            } catch (Exception e) {
                log.error("Error while trying to download the file={}", file, e);
            }
        }

        public static void main(String[] args) {
            if (args.length < 5) {
                log.info("The following parameters are needed in order to run this code: {accountName} {bucketName} {accountKey} {fileKeyToDownload} {filePathToDownloadTo}");
                System.exit(1);     // Non-zero exit code: required arguments are missing
            }

            String accountName = args[0];
            String bucketName = args[1];
            String accountKey = args[2];
            String fileKey = args[3];
            File downloadTo = new File(args[4]);

            log.debug("bucketName={} accountName={}", bucketName, accountName);
            AzureFileDownload azureFileDownload = new AzureFileDownload(bucketName, accountName, accountKey);
            azureFileDownload.getObject(fileKey, downloadTo);
        }
    }

Expected behavior
I would expect this call to be successful more than it fails.

Setup (please complete the following information):

OS: macOS Big Sur 11.3 and CentOS Linux release 7.8.2003 (Core)
IDE: IntelliJ
Library version: 12.9.0

Additional context
Logs from one failed run set to DEBUG: https://pastebin.com/G3wXVQrw

Information Checklist
Kindly make sure that you have added all the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.

[x] Bug Description Added
[x] Repro Steps Added
[x] Setup information Added


gapra-msft · 2021-04-30T00:49:51Z

Hi @jgmpzman

Thank you for reporting this issue.

@jaschrep-msft Could you please take a look at this tomorrow?

gapra-msft · 2021-05-07T20:58:02Z

Hi @jgmpzman, it looks like the default retry policy for downloads is to not retry. Could you please try using the downloadWithResponse API and set DownloadRetryOptions.maxRetryRequests?

jgmpzman · 2021-05-11T16:13:42Z

Hi @gapra-msft, it is working after switching to downloadToFileWithResponse and specifying some retries. Downloading a large file does seem to take much longer than uploading the same file. Do you happen to know whether a retry restarts the download from the beginning, or whether the API keeps its place in the file and continues from the point that failed?

gapra-msft · 2021-05-11T17:30:27Z

Hi @jgmpzman

Both downloadWithResponse and downloadToFileWithResponse resume from where the download failed rather than starting over. I've added additional detail below.

For downloadWithResponse, the retry will start downloading data from where it lost its place. Here is the link to the logic for that in case you are interested.
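
To make "start downloading data from where it lost its place" concrete, here is a minimal sketch of a resume-from-offset retry loop in plain Java. It is not the SDK's implementation: RangeSource, CountingSink, flaky, and the maxRetries parameter are hypothetical names invented for this illustration, and the flaky source only simulates a connection reset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class ResumableDownloadSketch {

    /** Hypothetical source that streams bytes starting at a given offset. */
    interface RangeSource {
        void readFrom(long offset, OutputStream sink) throws IOException;
    }

    /** Counts bytes passed to the destination so a retry knows where to resume. */
    static final class CountingSink extends OutputStream {
        private final OutputStream delegate;
        private long written;

        CountingSink(OutputStream delegate) {
            this.delegate = delegate;
        }

        @Override
        public void write(int b) throws IOException {
            delegate.write(b);
            written++;
        }

        long written() {
            return written;
        }
    }

    /** Downloads everything, re-requesting only the missing tail after each failure. */
    static long download(RangeSource source, OutputStream dest, int maxRetries) throws IOException {
        CountingSink sink = new CountingSink(dest);
        int retries = 0;
        while (true) {
            try {
                // Resume at the number of bytes already written, not at zero.
                source.readFrom(sink.written(), sink);
                return sink.written();
            } catch (IOException e) {
                if (retries++ >= maxRetries) {
                    throw e; // Retry budget exhausted: surface the last error.
                }
            }
        }
    }

    /** Simulated source: the first `failures` calls deliver `bytesPerAttempt` bytes, then fail. */
    static RangeSource flaky(byte[] data, int bytesPerAttempt, int failures) {
        int[] remainingFailures = {failures};
        return (offset, sink) -> {
            boolean fail = remainingFailures[0] > 0;
            int end = data.length;
            if (fail) {
                remainingFailures[0]--;
                end = Math.min(data.length, (int) offset + bytesPerAttempt);
            }
            for (int i = (int) offset; i < end; i++) {
                sink.write(data[i]);
            }
            if (fail) {
                throw new IOException("Connection reset by peer (simulated)");
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] blob = "0123456789abcdefghij".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long n = download(flaky(blob, 7, 2), out, 5);
        System.out.println(n + " bytes: " + new String(out.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

In this sketch the two simulated resets cost no repeated bytes: the second and third requests start at offsets 7 and 14. With a retry budget of zero, the first reset would propagate to the caller, which resembles the failure reported at the top of this thread.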

For downloadToFileWithResponse, the retry also covers only the failed download call, that is, the chunk that was in progress when the error occurred. The logic is slightly different here because the download is issued in chunks (4 MB by default, but you can change this by setting ParallelTransferOptions.blockSizeLong).
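
The chunked behaviour can be sketched the same way. The example below is an illustration in plain Java under stated assumptions, not the SDK's code: RangeReader, readWithRetry, and maxRetriesPerChunk are hypothetical names, the blockSize argument stands in for the chunk size mentioned above, and chunks are fetched one after another for simplicity.

```java
import java.io.IOException;
import java.util.Arrays;

public class ChunkedDownloadSketch {

    /** Hypothetical ranged read: returns the bytes in [offset, offset + length). */
    interface RangeReader {
        byte[] read(long offset, int length) throws IOException;
    }

    /** Fetches the data chunk by chunk; a failure only repeats the chunk it hit. */
    static byte[] download(RangeReader reader, int totalSize, int blockSize, int maxRetriesPerChunk)
            throws IOException {
        byte[] result = new byte[totalSize];
        for (int offset = 0; offset < totalSize; offset += blockSize) {
            int length = Math.min(blockSize, totalSize - offset);
            byte[] chunk = readWithRetry(reader, offset, length, maxRetriesPerChunk);
            System.arraycopy(chunk, 0, result, offset, length);
        }
        return result;
    }

    /** Retries a single ranged read up to maxRetries times before giving up. */
    static byte[] readWithRetry(RangeReader reader, long offset, int length, int maxRetries)
            throws IOException {
        for (int attempt = 0; ; attempt++) {
            try {
                return reader.read(offset, length);
            } catch (IOException e) {
                if (attempt >= maxRetries) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] blob = new byte[10];
        for (int i = 0; i < blob.length; i++) {
            blob[i] = (byte) i;
        }
        int[] calls = {0};
        // Simulated reader: every third request fails, like a reset connection.
        RangeReader flaky = (offset, length) -> {
            if (++calls[0] % 3 == 0) {
                throw new IOException("Connection reset by peer (simulated)");
            }
            return Arrays.copyOfRange(blob, (int) offset, (int) offset + length);
        };
        byte[] copy = download(flaky, blob.length, 4, 2);
        System.out.println(Arrays.equals(blob, copy) + " after " + calls[0] + " requests");
    }
}
```

Here 10 bytes are split into chunks of 4, 4, and 2 bytes. The third request fails and is repeated once, so the download completes after 4 requests instead of restarting from the first chunk.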

jgmpzman · 2021-05-11T17:52:00Z

@gapra-msft thanks for the information!

kasobol-msft mentioned this issue Apr 30, 2021

[BUG] Connection reset by peer is not handled well by retry policy #21091

Closed

joshfree assigned jaschrep-msft Apr 30, 2021

joshfree added the Client and Storage labels Apr 30, 2021

ghost removed the needs-triage label Apr 30, 2021

jgmpzman closed this as completed May 11, 2021

gapra-msft mentioned this issue Jun 15, 2021

[BUG] readAddress(..) failed: Connection reset by peer for big file #22268

Closed


github-actions bot locked and limited conversation to collaborators Apr 12, 2023
