[BUG] BlobClient.download fails with "Connection reset by peer" for large files #21066

Closed
jgmpzman opened this issue Apr 29, 2021 · 5 comments
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Storage Storage Service (Queues, Blobs, Files)

Comments

@jgmpzman

Describe the bug
We are using the Azure Java SDK in our product code. Part of the functionality we need to support is downloading a file from Azure. This works consistently for files under 10 GB. Downloads of block blobs larger than 10 GB run into problems, especially around 90 GB. The API has no problem uploading the 90 GB file; only the download fails frequently. It occasionally works, but it fails much more often than it succeeds. Is there a better API method that should be used for larger files? Any help would be greatly appreciated.

Exception or Stack Trace
2021-04-29 20:32:49 246 [main] ERROR com.jgm.AzureFileDownload - Error while trying to download the file=/home/kompuser/90gb.txt
reactor.core.Exceptions$ReactiveException: io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
at reactor.core.Exceptions.propagate(Exceptions.java:393)
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:97)
at reactor.core.publisher.Mono.block(Mono.java:1680)
at com.azure.storage.common.implementation.StorageImplUtils.blockWithOptionalTimeout(StorageImplUtils.java:99)
at com.azure.storage.blob.specialized.BlobClientBase.downloadWithResponse(BlobClientBase.java:562)
at com.azure.storage.blob.specialized.BlobClientBase.download(BlobClientBase.java:522)
at com.jgm.AzureFileDownload.getObject(AzureFileDownload.java:41)
at com.jgm.AzureFileDownload.main(AzureFileDownload.java:61)
Suppressed: java.lang.Exception: #block terminated with an error
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
... 6 more
Caused by: io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer

To Reproduce

  1. Upload a 90 GB block blob to Azure (this took about 75 minutes)
  2. Use BlobClient.download(OutputStream), passing a FileOutputStream, to download the file

Code Snippet
A shorter variation of the code is here:

  import com.azure.storage.blob.BlobClient;
  import com.azure.storage.blob.BlobContainerClient;
  import com.azure.storage.blob.BlobServiceClient;
  import com.azure.storage.blob.BlobServiceClientBuilder;
  import com.azure.storage.common.StorageSharedKeyCredential;
  import com.azure.storage.common.Utility;
  import org.slf4j.Logger;
  import org.slf4j.LoggerFactory;

  import java.io.File;
  import java.io.FileOutputStream;
  import java.io.OutputStream;
  import java.util.Locale;

  public class AzureFileDownload {

    private static final Logger log = LoggerFactory.getLogger(AzureFileDownload.class);

    private final BlobContainerClient container;

    public AzureFileDownload(String bucketName, String accountName, String accountKey) {
        String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
        this.container = createBlobServiceClient(new StorageSharedKeyCredential(accountName, accountKey), endpoint).getBlobContainerClient(bucketName);
    }

    private BlobClient createBlobClient(String key) {
        String encodedKey = Utility.urlEncode(key);     // Encode + and %
        return container.getBlobClient(encodedKey);
    }

    private BlobServiceClient createBlobServiceClient(StorageSharedKeyCredential credential, String endpoint) {
        return new BlobServiceClientBuilder().credential(credential).endpoint(endpoint).buildClient();
    }

    // Streams the whole blob into the target file; this is the call that fails for large blobs.
    public void getObject(String key, File file) {
        try (OutputStream outputStream = new FileOutputStream(file)) {
            BlobClient blob = createBlobClient(key);
            blob.download(outputStream);
        } catch (Exception e) {
            log.error("Error while trying to download the file={}", file, e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 5) {
            log.info("The following parameters are needed in order to run this code: {accountName} {bucketName} {accountKey} {fileKeyToDownload} {filePathToDownloadTo}");
            System.exit(1);     // Non-zero exit code: required arguments are missing
        }

        String accountName = args[0];
        String bucketName = args[1];
        String accountKey = args[2];
        String fileKey = args[3];
        File downloadTo = new File(args[4]);

        log.debug("bucketName={} accountName={}", bucketName, accountName);
        AzureFileDownload azureFileDownload = new AzureFileDownload(bucketName, accountName, accountKey);
        azureFileDownload.getObject(fileKey, downloadTo);
    }
  }

Expected behavior
I would expect this call to be successful more than it fails.

Setup (please complete the following information):

  • OS: macOS Big Sur 11.3 and CentOS Linux release 7.8.2003 (Core)
  • IDE: IntelliJ
  • Library version: 12.9.0

Additional context
Logs from one failed run set to DEBUG: https://pastebin.com/G3wXVQrw

Information Checklist
Kindly make sure that you have added all the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report

  • [X ] Bug Description Added
  • [X ] Repro Steps Added
  • [X ] Setup information Added

@ghost added the needs-triage, customer-reported, and question labels on Apr 29, 2021

@gapra-msft
Member

Hi @jgmpzman

Thank you for reporting this issue.

@jaschrep-msft Could you please take a look at this tomorrow?

@joshfree added the Client and Storage labels on Apr 30, 2021

@ghost removed the needs-triage label on Apr 30, 2021

@gapra-msft
Member

Hi @jgmpzman, it looks like the default retry policy for downloads is to not retry. Could you please try using the downloadWithResponse API and setting DownloadRetryOptions.maxRetryRequests?

@jgmpzman
Author

Hi @gapra-msft, it is working after switching to downloadToFileWithResponse and specifying some retries. Downloading a large file does seem to take much longer than uploading the same file. Do you happen to know whether a retry restarts the download from the beginning, or whether the API can keep its place in the file and continue from the point that failed?

@gapra-msft
Member

Hi @jgmpzman

For both downloadWithResponse and downloadToFileWithResponse, a retry resumes from where the download lost its place rather than starting over. I've added additional detail below.

For downloadWithResponse, the retry continues the same download from the point where it was interrupted. Here is the link to the logic for that in case you are interested.
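
To illustrate the idea of resuming from the current position, here is a minimal, self-contained sketch in plain Java. It is a simulation, not the SDK's code: RangeSource, FlakySource, the 16-byte read size, and the download helper are all hypothetical names and values invented for this example. The point it shows is that, after a simulated connection reset, the next read asks for the same offset again instead of going back to offset 0.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ResumeFromOffsetSketch {

    /** Hypothetical stand-in for a ranged read against remote storage. */
    interface RangeSource {
        byte[] read(int offset, int maxLength) throws IOException;
    }

    /** In-memory source that fails on chosen call numbers to simulate connection resets. */
    static final class FlakySource implements RangeSource {
        final byte[] payload;
        final Set<Integer> failingCalls;
        final List<Integer> requestedOffsets = new ArrayList<>();

        FlakySource(byte[] payload, Set<Integer> failingCalls) {
            this.payload = payload;
            this.failingCalls = failingCalls;
        }

        @Override
        public byte[] read(int offset, int maxLength) throws IOException {
            requestedOffsets.add(offset);
            // The call number is simply how many reads have been requested so far.
            if (failingCalls.contains(requestedOffsets.size())) {
                throw new IOException("Connection reset by peer (simulated)");
            }
            int end = Math.min(payload.length, offset + maxLength);
            return Arrays.copyOfRange(payload, offset, end);
        }
    }

    /** Reads totalLength bytes, retrying failed reads from the current offset. */
    static byte[] download(RangeSource source, int totalLength, int maxRetries) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int offset = 0;
        int retriesUsed = 0;
        while (offset < totalLength) {
            try {
                byte[] data = source.read(offset, 16);
                if (data.length == 0) {
                    throw new IOException("Unexpected end of data at offset " + offset);
                }
                out.write(data, 0, data.length);
                offset += data.length; // progress is kept across failures
            } catch (IOException e) {
                if (retriesUsed >= maxRetries) {
                    throw new UncheckedIOException("Retries exhausted at offset " + offset, e);
                }
                retriesUsed++; // the next loop iteration re-requests the same offset
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = new byte[100];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) i;
        }
        FlakySource source = new FlakySource(payload, new HashSet<>(Arrays.asList(3, 6)));
        byte[] result = download(source, payload.length, 5);
        System.out.println("complete=" + Arrays.equals(payload, result));
        System.out.println("requested offsets=" + source.requestedOffsets);
    }
}
```

With maxRetries set to 0, which mirrors the no-retry default mentioned earlier in this thread, the first simulated reset aborts the whole download.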

For downloadToFileWithResponse, the retry will also only repeat the failed download call, that is, the chunk that was being downloaded when the failure happened. The logic is slightly different here since we call download in chunks (4 MB each by default, but you can change this by setting ParallelTransferOptions.blockSizeLong).
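
The chunked variant can be sketched the same way. Again this is a plain-Java simulation under stated assumptions, not the SDK's implementation: ChunkFetcher, FailOnceFetcher, and the tiny 8-byte block size are hypothetical, and the chunks are fetched one after another for simplicity, whereas a real transfer may fetch them concurrently. What it shows is that a failure costs only a re-download of the affected chunk, and every other chunk is requested exactly once.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChunkedRetrySketch {

    /** Hypothetical stand-in for one ranged download call. */
    interface ChunkFetcher {
        byte[] fetch(int offset, int length) throws IOException;
    }

    /** In-memory fetcher whose first request for one chosen offset fails. */
    static final class FailOnceFetcher implements ChunkFetcher {
        final byte[] payload;
        final int failingOffset;
        final List<Integer> requestedOffsets = new ArrayList<>();
        private boolean alreadyFailed;

        FailOnceFetcher(byte[] payload, int failingOffset) {
            this.payload = payload;
            this.failingOffset = failingOffset;
        }

        @Override
        public byte[] fetch(int offset, int length) throws IOException {
            requestedOffsets.add(offset);
            if (offset == failingOffset && !alreadyFailed) {
                alreadyFailed = true;
                throw new IOException("Connection reset by peer (simulated)");
            }
            return Arrays.copyOfRange(payload, offset, offset + length);
        }
    }

    /** Splits the range into blocks and retries each block independently. */
    static byte[] download(ChunkFetcher fetcher, int totalLength, int blockSize, int maxRetriesPerChunk) {
        byte[] destination = new byte[totalLength];
        for (int offset = 0; offset < totalLength; offset += blockSize) {
            int length = Math.min(blockSize, totalLength - offset);
            byte[] chunk = fetchWithRetry(fetcher, offset, length, maxRetriesPerChunk);
            // Each chunk lands at its own offset, so earlier chunks are never rewritten.
            System.arraycopy(chunk, 0, destination, offset, length);
        }
        return destination;
    }

    private static byte[] fetchWithRetry(ChunkFetcher fetcher, int offset, int length, int maxRetries) {
        int attempts = Math.max(0, maxRetries) + 1; // one initial try plus the retries
        IOException last = null;
        for (int attempt = 0; attempt < attempts; attempt++) {
            try {
                byte[] chunk = fetcher.fetch(offset, length);
                if (chunk.length != length) {
                    throw new IOException("Short read at offset " + offset);
                }
                return chunk;
            } catch (IOException e) {
                last = e;
            }
        }
        throw new UncheckedIOException("Chunk at offset " + offset + " failed", last);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[20];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) i;
        }
        FailOnceFetcher fetcher = new FailOnceFetcher(payload, 8);
        byte[] result = download(fetcher, payload.length, 8, 3);
        System.out.println("complete=" + Arrays.equals(payload, result));
        System.out.println("requested offsets=" + fetcher.requestedOffsets);
    }
}
```

One consequence of this design is that the block size bounds how much data a single failure can waste: a smaller block means less to re-download per failure, at the cost of more requests overall.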

@jgmpzman
Author

@gapra-msft thanks for the information!

@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2023