[storage-blob] blobClient.download does not close network connection if the stream is not fully read #11850
Sample code (abridged; function bodies were elided in the original post):

```js
'use strict'
const express = require('express')
const storageAccount = 'XXXX' // Storage Account Name
const storageURL = 'http://' + storageAccount + '.blob.core.windows.net'

function SRCalc (req, size) {
  if (sr === undefined) { /* … */ }
}

function azureBlobStream (req, cb) { /* … */ }

function filestream (req, res) {
  // stream.on('data', function (chunk) { /* … */ })
}

function serveRoutes () {
  router.get('/video', function (req, res) { /* … */ })
  router.get('/file', function (req, res) { /* … */ })
  return router
}

const api = express().use('/serve/', serveRoutes())
```

Then in a browser open http://localhost:2323/serve/video and observe via netstat that a connection to the blob endpoint remains open, with data "stuck", even though the stream was destroyed.
The alternate method of draining with stream.on('data', function (chunk) { ... }) does close the connection; however, it causes delay, because it continues to read from the Azure blob stream even after the response is closed, rather than terminating the request.
Hey @dcegress, in the mentioned case, is "Response closed, destroying stream" printed when using stream.unpipe() and stream.destroy()?
Would you be able to drain the buffer with stream.on('data', function...) while ensuring the stream gets properly closed? That may be a good state, as it ensures no memory or TCP connection leak. Thanks.
@jeremymeng for visibility as well.
Yes, the close event triggers, and with unpipe() and destroy() we can see in the debugger that the stream itself is destroyed, but the network connection remains ESTABLISHED. Draining the stream with on('data') allows the stream to reach its 'end' event and properly close down the network connection; however, this means that for a very large file we have to consume all the data unnecessarily.
@dcegress We will look into this.
Sure - I ran the code via a debugger (Visual Studio Code) with a breakpoint. At the point after stream.destroy() is executed I can see that stream.destroyed returns true and any further operation on the stream is not possible. If you run the code above, open a browser to localhost:2323/serve/video, wait for the page to load, then check netstat: you will see that although the browser has closed its connection after getting the first few MB, there is an ESTABLISHED connection over https to the blob store.
There is an even simpler way to reproduce. Use the code below to stream a large file from Azure Blob to the browser as a download link, then click cancel halfway through. The res is closed, but the network connection from Node to the blob store remains.

```js
blobClient.download(streamRange.start, (streamRange.end - streamRange.start) + 1).then(function (response) {
```
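A note on the range arithmetic above: blobClient.download(offset, count) takes a byte offset and a byte *count*, while the range being served is inclusive on both ends, hence the "+ 1". A minimal sketch of that conversion (the helper name is ours, not part of the SDK):

```javascript
'use strict';

// Convert an inclusive byte range {start, end} into the (offset, count)
// pair expected by blobClient.download(). Hypothetical helper for clarity.
function rangeToOffsetCount(range) {
  return { offset: range.start, count: range.end - range.start + 1 };
}

// Bytes 0..1023 inclusive is a count of 1024.
console.log(rangeToOffsetCount({ start: 0, end: 1023 }));
```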
Wait a minute. Are you using our library in the browser or in Node.js?
Node.js
The comment about the browser was just to illustrate the problem: Browser → Node.js → Azure Blob. When the browser closes its connection to Node, Node does not close its connection to Azure.
Got it. Recreated locally with the following code. Manual GC didn't help. @jeremymeng please take a look.

```ts
import { BlobClient } from "@azure/storage-blob";
import { Readable } from "stream";
import * as dotenv from "dotenv";

dotenv.config();

// Requires running node with --expose-gc for global.gc to be available.
function scheduleGc() {
  if (!global.gc) {
    console.log("Garbage collection is not exposed");
    return;
  }
  setTimeout(function () {
    global.gc();
    console.log("Manual gc", process.memoryUsage());
    scheduleGc();
  }, 10 * 1000);
}
scheduleGc();

async function main() {
  const blobURLWithSAS = process.env.BLOB_URL || "";
  const blobClient = new BlobClient(blobURLWithSAS);
  for (let i = 0; i < 1000; i++) {
    const res = await blobClient.download();
    console.log(`download call done for ${i}`);
    const stream: Readable = res.readableStreamBody as any;
    stream.on("close", () => {
      console.log(`stream ${i} closed`);
    });
    stream.destroy();
  }
}
main();
```
@dcegress I see that you are specifying a pipeline option. @ljian3377 your repro code is using the default pipeline.
Yes, the connections stay ESTABLISHED. It makes no difference what keepAlive is set to. I don't think it's just connection reuse, because there is unread data in the Recv-Q of the socket; if the connection were just sitting idle waiting to be used for another request, that wouldn't be the case.
It's worth noting we have shifted the stream to use the got module instead of BlobClient and the problem doesn't happen; however, we obviously have to handle a lot more of the API than we would if we could use the module instead.
@dcegress you are right. The file I was downloading in my test is small, so downloading finished and the connections move to the CLOSE_WAIT state. Looking at our core-http code, we pass an abortSignal to node-fetch; however, after we get the response back, we remove the listener. This seems like an issue to me, as we would want the ability to cancel the underlying download stream. I will investigate further.
Logged issue #12029 and submitted PR #12038 to fix it. With that, users can use:

```js
const aborter = new AbortController();
const res = await blobClient.download(0, undefined, {
  abortSignal: aborter.signal,
});
console.log(`download call done for ${i}`);
const stream = res.readableStreamBody;
stream.on("close", () => {
  console.log(`stream ${i} closed`);
  aborter.abort();
});
stream.on("error", (error) => {
  console.log(`stream ${i} on error`);
  console.log(error.message);
});
```
@jeremymeng @XiaoningLiu @jiacfan @bterlson @xirzec |
That is strange. How big is your video file? Mine is about 1 GB, to allow for enough scrubbing before the whole file is cached.
I tried with a 250 MB video. Will try a bigger file.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.
Hi @dcegress, we deeply appreciate your input into this project. Regrettably, this issue has remained inactive for over 2 years, leading us to the decision to close it. We've implemented this policy to maintain the relevance of our issue queue and facilitate easier navigation for new contributors. If you still believe this topic requires attention, please feel free to create a new issue, referencing this one. Thank you for your understanding and ongoing support. |
Describe the bug
When opening a blobClient to a video file on Azure Blob Storage and using pipe() to stream it to the response with
blobStream.pipe(res)
When the browser prematurely closes the response (as it does when the HTML5 video tag is used to get the thumbnail/metadata), the stream is closed, but the underlying network connection to Azure Storage stays open and ESTABLISHED with keep-alive traffic.
blobStream.unpipe() and blobStream.destroy() do not cause it to be removed.
Eventually the operating system runs out of TCP memory, as all these ESTABLISHED connections have data in the Recv-Q.
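One way to observe the symptom described (assuming a Linux net-tools netstat, whose columns are Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, and an HTTPS connection to the blob endpoint) is to filter for ESTABLISHED connections on port 443 and print their Recv-Q:

```shell
# Print Recv-Q and remote address of ESTABLISHED HTTPS connections,
# i.e. candidates for the "stuck data in Recv-Q" state described above.
netstat -tn | awk '$1 == "tcp" && $6 == "ESTABLISHED" && $5 ~ /:443$/ { print $2, $5 }'
```

A leaked connection shows up here with a persistently nonzero Recv-Q long after the browser has gone away.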