Upload blob async frozen when using memory stream, no error #10814

Closed · diegosasw opened this issue Mar 24, 2020 · 24 comments · Fixed by #14301
Labels:
- bug: This issue requires a change to an existing behavior in the product in order to be resolved.
- Client: This issue points to a problem in the data-plane of the library.
- customer-reported: Issues that are reported by GitHub users external to the Azure organization.
- Service Attention: Workflow: This issue is responsible by Azure service team.
- Storage: Storage Service (Queues, Blobs, Files)

diegosasw commented Mar 24, 2020

I'm experiencing a problem with Azure.Storage.Blobs 12.4.0.
No error is thrown; the call just hangs forever when trying to upload a stream (an in-memory JSON string) to Azurite.

It seems the server receives the attempt but does nothing, and the API does not throw any error either:

sqlserver_1  | 172.18.0.1 - - [24/Mar/2020:01:38:55 +0000] "PUT /devstoreaccount1/landing/test637206559996473201.json?st=2020-03-23T13%3A57%3A00Z&se=2020-03-25T13%3A57%3A37Z&sp=racwdl&sv=2018-03-28&sr=c&sig=RYDvl1aQeEIY3isg17LGGtx63Krum13gTXgn981cOco%3D HTTP/1.1" - -

Azurite is running; I can access it with Azure Storage Explorer and generate the SAS token URI used in the code below.

My code:

var fileName = $"test{DateTime.UtcNow.Ticks}.json";
var fileStream = new MemoryStream();
fileStream.Write(Encoding.UTF8.GetBytes("{'foo': 'bar'}"));
var blobContainerClient = new BlobContainerClient(new Uri("http://127.0.0.1:10000/devstoreaccount1/landing?st=2020-03-23T13%3A57%3A00Z&se=2020-03-25T13%3A57%3A37Z&sp=racwdl&sv=2018-03-28&sr=c&sig=RYDvl1aQeEIY3isg17LGGtx63Krum13gTXgn981cOco%3D"));
await blobContainerClient.UploadBlobAsync(fileName, fileStream); // this gets stuck, no error, just frozen there

PS: The Azurite instance I use for the test is:

version: '3'
services:
    sqlserver:
        image: mcr.microsoft.com/azure-storage/azurite
        restart: always
        ports:
            - 10000:10000
            - 10001:10001
# Run
# docker-compose -f azurite.yml up -d 

Can anybody spot what the issue might be?

triage-new-issues bot added the needs-triage label Mar 24, 2020
diegosasw (Author) commented:

I've also tried with a real Azure Blob Storage account using a SAS URI:

https://hidden.blob.core.windows.net/prueba?st=2020-03-23T14%3A24%3A00Z&se=2020-03-26T14%3A24%3A00Z&sp=racwdl&sv=2018-03-28&sr=c&sig=DLVa3Lbg8JKZlCGRSDs1M%2F69Cc4s9F7keFEE1roRScA%3D

Same result, so it does not seem to be related to Azurite; it must be something in my code sample or in Azure.Storage.Blobs 12.4.0.

diegosasw changed the title from "Upload blob async frozen, no error" to "Upload blob async frozen when using memory stream, no error" Mar 24, 2020
diegosasw (Author) commented Mar 24, 2020

Just ran another test with a different stream. This time I read a PDF as a FileStream, and it works fine for both real Azure Blob Storage and Azurite.
So the problem is with the memory stream:

var fileStream = new MemoryStream();
fileStream.Write(Encoding.UTF8.GetBytes("{'foo': 'bar'}"));

Not sure whether the stream is not properly created and positioned for reading, but in any case it should probably throw an error rather than freeze.

The sample below works fine.

var fileName = $"test{DateTime.UtcNow.Ticks}.pdf";
var fileStream = new FileStream("TestSupport\\sample.pdf", FileMode.Open, FileAccess.Read);
var blobContainerClient = new BlobContainerClient(new Uri("http://127.0.0.1:10000/devstoreaccount1/landing?st=2020-03-23T13%3A57%3A00Z&se=2020-03-25T13%3A57%3A37Z&sp=racwdl&sv=2018-03-28&sr=c&sig=RYDvl1aQeEIY3isg17LGGtx63Krum13gTXgn981cOco%3D"));
await blobContainerClient.UploadBlobAsync(fileName, fileStream); // it works!

jsquire added the Client, customer-reported, Service Attention, and Storage labels and removed the needs-triage label Mar 24, 2020
ghost commented Mar 24, 2020

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.

DarrenClevenger commented Mar 24, 2020

I was also having this issue. I found that I had to reset the position to 0 to get this to work.

using var ms = new MemoryStream(); 
ms.Write(ASCIIEncoding.UTF8.GetBytes(jsonData));  
ms.Position = 0;
client.UploadBlob(path, ms);

francescocristallo commented Mar 28, 2020

> I was also having this issue. I found that I had to reset the position to 0 to get this to work.
>
> using var ms = new MemoryStream();
> ms.Write(ASCIIEncoding.UTF8.GetBytes(jsonData));
> ms.Position = 0;
> client.UploadBlob(path, ms);

I have the same problem with the DataLake Gen2 APIs.
Resetting the position to 0 worked for me.
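
A related workaround (a sketch, not taken from this thread): constructing the MemoryStream over the already-encoded byte array leaves Position at 0, so no explicit rewind is needed. The names jsonData, path, and client below mirror the snippet quoted above.

// Sketch only: build the stream over the bytes up front; Position starts at 0.
using var ms = new MemoryStream(Encoding.UTF8.GetBytes(jsonData));
client.UploadBlob(path, ms);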

pakrym added the bug label Mar 30, 2020
pakrym (Contributor) commented Mar 30, 2020

Seems like a bug. When the stream position is not reset, the expected behavior should be to upload an empty blob, not to hang.

amishra-dev (Contributor) commented:

@kasobol-msft can you please triage?

pakrym (Contributor) commented Apr 1, 2020

Not sure if it's appropriate to rewind the customer's stream. Does MemoryStream.Read hang when it's at the end?

AFAIK it returns 0 immediately: https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.CoreLib/src/System/IO/MemoryStream.cs#L354

seanmcc-msft (Member) commented:

I've seen MemoryStream.Read() hang when it's at the end. We can reproduce it fairly easily by writing data to a memory stream and then attempting to upload it without rewinding.

pakrym (Contributor) commented Apr 1, 2020

Just tried it:

using System;

namespace ms_read
{
    class Program
    {
        static void Main(string[] args)
        {
            var s = new System.IO.MemoryStream();
            var bytes = new byte[100];
            for (int i = 0; i < 10 ; i++)
            {
                Console.WriteLine(s.Read(bytes));
            }
        }
    }
}

Outputs:

D:\temp\ms-read> dotnet run
0
0
0
0
0
0
0
0
0
0

seanmcc-msft (Member) commented:

[Test]
public async Task UploadAsync_MemoryStream()
{
    // Arrange
    await using DisposingContainer test = await GetTestContainerAsync();

    var blockBlobName = GetNewBlobName();
    BlockBlobClient blob = InstrumentClient(test.Container.GetBlockBlobClient(blockBlobName));
    var data = GetRandomBuffer(Size);
    MemoryStream stream = new MemoryStream();
    stream.Write(data, 0, data.Length);

    // Act - hangs here.
    await blob.UploadAsync(content: stream);
}

We end up calling _request.Content = Azure.Core.RequestContent.Create(body); when creating a request based on the stream.

pakrym (Contributor) commented Apr 1, 2020

I'm not arguing that the upload doesn't hang; I just wonder whether it hangs on MemoryStream.Read or on something else, like our parallel upload logic.

seanmcc-msft (Member) commented:

Try MemoryStream.Read() on a non-empty MemoryStream where Position == Length.

pakrym (Contributor) commented Apr 1, 2020

var bytes = new byte[100];
var s = new System.IO.MemoryStream(bytes);

for (int i = 0; i < 5; i++)
{
    Console.WriteLine($"{s.Position} {s.Length}");
    Console.WriteLine(s.Read(bytes));
}

Outputs:
D:\temp\ms-read> dotnet run
0 100
100
100 100
0
100 100
0
100 100
0
100 100
0

seanmcc-msft (Member) commented:

Hmm, maybe we are seeking outside the range of the MemoryStream, or it's something in our Upload logic.

jsquire added this to the Backlog milestone Jun 2, 2020
ANRCorleone commented:

> I was also having this issue. I found that I had to reset the position to 0 to get this to work.
>
> using var ms = new MemoryStream();
> ms.Write(ASCIIEncoding.UTF8.GetBytes(jsonData));
> ms.Position = 0;
> client.UploadBlob(path, ms);
>
> I have the same problem with the DataLake Gen2 APIs.
> Resetting the position to 0 worked for me.

Had the same issue - about an hour saved. Thank you for this!

seanmcc-msft (Member) commented:

@pekspro suggested that we check the stream position in UploadAsync() and throw an exception if Position != 0. I think this would be a good idea to reduce the ambiguity around this situation.

@pakrym, would you be ok with this?

pakrym (Contributor) commented Jun 17, 2020

No. Uploading a stream starting from an arbitrary position is a totally normal practice in .NET, and we shouldn't artificially limit it. In addition, some streams don't track their position and throw instead.

I'm also worried that it would hide a real issue we have.

seanmcc-msft (Member) commented:

I think we should do something to address this; the current user experience is not optimal.

What if we check if stream.Position == stream.Length?
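
For illustration only, a hypothetical sketch of the kind of guard being discussed here; the names (content) are assumptions and this is not code from the library:

// Hypothetical validation at the start of an upload method (illustration only).
// A seekable, non-empty stream already positioned at its end has nothing left to read.
if (content.CanSeek && content.Length > 0 && content.Position == content.Length)
{
    throw new ArgumentException(
        "The stream is positioned at its end, so nothing would be uploaded. " +
        "Did you forget to rewind it with stream.Position = 0?",
        nameof(content));
}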

pakrym (Contributor) commented Jun 17, 2020

We should find out why we hang instead of exiting the upload loop when Stream.Read returns 0.

seanmcc-msft (Member) commented:

This problem also occurs for small streams that we attempt to upload in a single request. Is this an issue in other SDKs based on Azure.Core?

pakrym (Contributor) commented Jun 17, 2020

No others that I know of.

pekspro (Contributor) commented Jun 20, 2020

As I understand it, this is what happens:

In BlobRestClient there is this line:

_request.Headers.SetValue("Content-Length", 
          contentLength.ToString(System.Globalization.CultureInfo.InvariantCulture));

This sets the Content-Length header of the request, which will be incorrect when the stream's Position is not zero.

Then in HttpClientTransport.ProcessAsync there is this line:

responseMessage = await _client.SendAsync(httpRequest, HttpCompletionOption.ResponseHeadersRead, message.CancellationToken)
                    .ConfigureAwait(false);

This writes the stream to the request, but since the expected length is incorrect, the request never completes.

I am sure it is more complicated than this, but this is what I found anyway :-)

I have created PR #12901 that fixes, or at least reduces, this problem.
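
A minimal standalone sketch of the mismatch described above, assuming (as this comment suggests) that the Content-Length header is computed from the full stream length while only the bytes after the current Position are ever read:

using System;
using System.IO;
using System.Text;

class ContentLengthMismatchDemo
{
    static void Main()
    {
        var stream = new MemoryStream();
        byte[] payload = Encoding.UTF8.GetBytes("{'foo': 'bar'}");
        stream.Write(payload, 0, payload.Length);
        // Position is now 14 (the end of the stream), not 0.

        long declaredContentLength = stream.Length;              // 14 bytes promised to the server
        long bytesLeftToRead = stream.Length - stream.Position;  // 0 bytes actually available

        Console.WriteLine($"Content-Length: {declaredContentLength}, readable from current Position: {bytesLeftToRead}");
        // With a Content-Length of 14 but nothing left to read, the request body is never
        // completed, which matches the hang reported in this issue.
    }
}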
