Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timing/Buffer issue when using AWSStreamHandler #166

Closed
gitaugakwa opened this issue Jan 2, 2024 · 5 comments
Closed

Timing/Buffer issue when using AWSStreamHandler #166

gitaugakwa opened this issue Jan 2, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@gitaugakwa
Copy link

gitaugakwa commented Jan 2, 2024

Current Behavior

When using the AWS Stream Handler, writes are buffered until either another write occurs or the stream ends. In the case of another write, the second write then experiences the above behaviour.

Expected Behavior

The write to be immediately dispatched and does not depend on another write or the end of the stream

Steps to Reproduce the Problem

const app: Express = express();
app.use("/test", function (req, res, next) {
        //when using text/plain it did not stream
        //without charset=utf-8, it only worked in Chrome, not Firefox
        res.setHeader("Content-Type", "text/html; charset=utf-8");
        res.setHeader("Transfer-Encoding", "chunked");

        res.write("Thinking...");
        sendAndSleep(res, 1);
        // (res as unknown as Writable).on("")
        // return;
});

function sendAndSleep(response: any, counter: number) {
        if (counter > 5) {
                response.end();
        } else {
                response.write(" ;i=" + counter);
                // response.flush();
                counter++;
                setTimeout(function () {
                        sendAndSleep(response, counter);
                }, counter * 1000);
        }
}

const expressFramework = new ExpressFramework();
const framework = new LazyFramework(expressFramework, () => app);

export const handler = ServerlessAdapter.new(null)
        .setFramework(framework)
        .setHandler(new AwsStreamHandler())
        .setResolver(new DummyResolver())
        .setLogger(createDefaultLogger({ level: "info" }))
        .addAdapter(new ApiGatewayV2Adapter())
        // more options...
        //.setFramework(new ExpressFramework())
        .build();

In the above example, you could use the interval that a user receives two numbers to determine the timing. The difference between when 2 is sent and 3 is sent over the network is 4 seconds, which should not be the case. It should be 3 seconds. This shows that 3 is in fact being written but not yet sent, then once 4 is written the 3 is sent over the network.

Environment

  • Version: 4.0.1
  • Platform: Win (local) Linux (cloud)
  • Node.js Version: v18
@gitaugakwa gitaugakwa added the bug Something isn't working label Jan 2, 2024
@H4ad
Copy link
Owner

H4ad commented Jan 2, 2024

I tested the following code:

const util = require('node:util');

exports.handler = awslambda.streamifyResponse(
    async (event, responseStream, context) => {
        function sendAndSleep(response, counter) {
            if (counter > 5) {
                    response.end();
            } else {
                    console.log('Writing ' + counter);
                    response.write(" ;i=" + counter);
                    counter++;
                    setTimeout(function () {
                            sendAndSleep(response, counter);
                    }, counter * 1000);
            }
        }
        
        const finalResponse = awslambda.HttpResponseStream.from(
          responseStream,
          { statusCode: 200, headers: { 'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked' }  },
        );
        console.log(util.inspect(finalResponse, { depth: null, getters: true, showHidden: true, showProxy: true }));

        finalResponse.write("Thinking...");

        sendAndSleep(finalResponse, 1);
    }
);

I got the same behavior that my library has, which is:

  • First I got Thinking... ;i=1 ;i=2 after 5s
  • Then ;i=3 after 3s
  • Then ;i=4 after 4s
  • Then ;i=5 after 5s

From what I saw in the logs, the timings are correct:

image

You can enable the INFO RUNTIME STREAM logs by setting the following env variable: AWS_LAMBDA_RUNTIME_VERBOSE=3.

Probably the delay/timing/buffer is caused because the AWS is still trying to open the connection while we try to write to the stream.

You can open an issue on https://github.com/aws/aws-lambda-nodejs-runtime-interface-client to ask the AWS if this behavior is expected or if there is any hint/hack that we can do to speedup the connection.

@H4ad
Copy link
Owner

H4ad commented Jan 2, 2024

I think this is a bug, the following code didn't have that delay of 5s:

const util = require('node:util');

exports.handler = awslambda.streamifyResponse(
    async (event, responseStream, context) => {      
        responseStream.setContentType('text/html; charset=utf-8');
        responseStream.write("Thinking...");
        
        setTimeout(function() {
            responseStream.write("After 1s");

            responseStream.end();
        }, 1000);
    }
);

But I don't have idea why.

@H4ad
Copy link
Owner

H4ad commented Jan 2, 2024

The buffer effect is probably caused by the internal highWaterMark of AWS:

const util = require('node:util');
const HANDLER_HIGHWATERMARK = Symbol.for(
  'aws.lambda.runtime.handler.streaming.highWaterMark',
);

const handler = async (event, responseStream, context) => {
        responseStream.setContentType('text/html; charset=utf-8');

        responseStream.write("Thinking...");
        responseStream.write("a".repeat(64 * 1024));

        setTimeout(function() {
            responseStream.write("b".repeat(64 * 1024));
        }, 3000);

        setTimeout(function() {
            responseStream.write("After 6s");
            responseStream.end();
        }, 6000);
    };

handler[HANDLER_HIGHWATERMARK] = 512;

exports.handler = awslambda.streamifyResponse(handler, {
        highWaterMark: 512
    }
);

The default highWaterMark is defined as 64Kb but there is no way to override it.

@gitaugakwa
Copy link
Author

gitaugakwa commented Jan 2, 2024

Thanks @H4ad I also did a bit of testing on my end and found the issue to be aws buffering what is written to the response.

I wish there was a way to instant send out the data as soon as possible. What I've found to work is having what is written res.write() totaling to ~15,000 bytes. Which is quite an overhead for some small streams that are about 300 bytes large.

Would be nice to have the override.

@gitaugakwa
Copy link
Author

I'll close this since the new issue is more concise.

aws/aws-lambda-nodejs-runtime-interface-client#94

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants