Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

browserify not working for node apps #696

Closed
doapp-ryanp opened this issue Aug 28, 2015 · 20 comments
Closed

browserify not working for node apps #696

doapp-ryanp opened this issue Aug 28, 2015 · 20 comments
Labels
closing-soon This issue will automatically close in 4 days unless further comments are made. guidance Question that needs advice or information.

Comments

@doapp-ryanp
Copy link

Trying to browserify my node code for use in lambda to help combat the cold start issues (which also raises the ceiling on the hard 50MB zipped limit).

Is browserfying for node not supported? Ultamately I'd like to just use the --node browserify option, but its not working.

Ex: (I wont be hard coding creds, this is just for ease of reproduce.)

var AWS = require('aws-sdk');
AWS.config.update({
  accessKeyId: 'mykey',
  secretAccessKey: 'mysecret'
});
var s3 = new AWS.S3();
s3.listBuckets(function(err, data) {
  console.log(err, data);
});

If I run browserify bundletests.js > bundled.js then run node bundle.js I get XMLHttpRequest is not defined which is expected since I'm not running in the browser.

So I run browserify bundletests.js --dg --no-builtins --no-commondir > bundled.js to not use the builtins and I get

module.js:340
    throw err;
          ^
Error: Cannot find module '_process'
    at Function.Module._resolveFilename (module.js:338:15)
    at Function.Module._load (module.js:280:25)
    at Module.require (module.js:364:17)
    at require (module.js:380:17)
    at s (/Users/ryan/projects/JAWS/tests/bundle/testapp/myLambda/bundled.js:1:176)

Any ideas?

@AdityaManohar
Copy link
Contributor

@doapp-ryanp I'm still a little unclear on how you intend to use your browserify-ed code inside of AWS Lambda. Lambda requires the function to export a top level CommonJS module.

That being said, browserify-ing the SDK for use in Node.js is currently not supported, because we rely on the window object to hang the AWS namespace on. There might be other potential issues like the one you are running into.

Would you be able to explain a use case a little more? I'm curious about why uploading an archive of Node.js modules is not working well for you.

@doapp-ryanp
Copy link
Author

@AdityaManohar I have broswerfied code that is working just fine in lambda. You can export top level CommonJS method. The bummer is the only node module I have not gotten to work so far is aws-sdk.

If your curious on how I'm doing this, take a look at the JAWS code I wrote.

I wan't to be able to bundle up aws-sdk so I'm not restricted to the version that is included in lambda (2.1.35 is already backlevel). Because its not browserify compatible, I have to .exclude('aws-sdk') and use the external 2.1.35 version thats included in lambda. If its not obvious why this is problematic let me know and I will elaborate.

As for the use case for bundling - first some background:

I'd like to start out by saying I love lambda+api gateway. It is a game changer. With that said the cold start lag times make it un-useable as a production webservice/api layer. I define cold start as: 1st call after initial creation (brand new lambda, or contaner that has been provisioned due to load) OR 1st call after being frozen from inactivity (currently ~5mins).

I know this is a extremely difficult problem to solve at scale (and the problem is worse in Java lambdas currently). I (and co-workers/friends) have had discussions with lambda and api gateway engineers and I know they are working on improving the cold start issues. I'm aware this issue is not the forum to address the problem - but you must understand the background in order to understand my use case.

With this in mind, the nature of the product (containers launching + grabbing code, containers being unfrozen etc) it is of paramount importance that the following conditions are optimized as much as possible to cut down on cold start times:

  1. code size. Smaller size, smaller compressed file, faster code gets from storage into the container. Less code faster the VM starts.
  2. Removal of all code that is note directly related to processing the event. [Best practices] from AWS docs (http://docs.aws.amazon.com/lambda/latest/dg/best-practices.html) concurs with this.

So my use case is to leverage build/bundle tooling like browserify/systemjs + uglifyier/minifier.

I want lambda+api gateway to be a successful production grade product. With AWS's track record I think it will become just this. Cutting every ms of lag time off that can be done with a reasonable amount of work, will go a long way in making it successful. IMO most lambda code that uses the aws-sdk will only be using a fraction of its codbase - supporting a popular tool like browserify to remove the un-used code is a reasonable request.

I know its more work, but with AWS's investment in lambda, maybe now is the time to invest more resources in the aws-sdk to separate it into sdk for browser and sdk for node. My hunch is this will not only make it easier for tools like browserify/systemjs to work, but it will also allow for some optimizations that will improve its efficiency in lambda.

Sorry this was so long winded, hard to convey this problem and use case concisely

@AdityaManohar
Copy link
Contributor

@doapp-ryanp Thanks for the explanation! This definitely helps understand your use case better.

I did some investigation and I was able to reproduce the problem with a rather trivial use case:

// log.js

var log = function() {
  console.log(process.memoryUsage());
};

module.exports = log;
// source.js
var log = require('./log');
log();

Browserify-ing the source file still throws the same error

$> browserify source.js --dg --no-builtins --no-commondir > bundle.js
module.js:338
    throw err;
          ^
Error: Cannot find module '_process'
    at Function.Module._resolveFilename (module.js:336:15)
    at Function.Module._load (module.js:278:25)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at s (/Users/aditm/dev/code/bundle.js:1:176)
    at /Users/aditm/dev/code/bundle.js:1:367
    at Object._process (/Users/aditm/dev/code/bundle.js:16:14)
    at s (/Users/aditm/dev/code/bundle.js:1:316)
    at /Users/aditm/dev/code/bundle.js:1:367
    at Object../log (/Users/aditm/dev/code/bundle.js:2:11)

It also looks like we only depend on the window object when it is available. This leads me to believe that there shouldn't be a problem when you try to use the browserify-ed SDK in Node. Presumably there is somethings else going on with the way browserify handles globals.

Hope this helps!

@doapp-ryanp
Copy link
Author

Nope it does not work. In my continued digging, turns out there is a way to insert global vars, which gets around the cant find _process module error.

However this exposes another problem. api_loader.js does require()s using __dirname.

Not quite sure why __dirname is needed here (vs using rel paths) but ignoring that., according to #383 it looks like you have some special code that runs for your hosted aws sdk for browser. But later in that issue @lsegal says that as of v2.0.20 you no longer need to do this hack.

I asked in that issue a week or so ago for some clarification. What am I missing here?

Here is my browserify code:

browserify({
        basedir: baseDir,
        entries: entries,
        standalone: 'lambda',

        //setup for node app (copy logic of --node in bin/args.js)
        browserField: false,
        builtins: false,
        commondir: false,
        detectGlobals: true,  //default for bare in cli is true, but we dont care if its slower

        //handle process https://github.com/substack/node-browserify/issues/1277
        insertGlobalVars: {
          //__filename: insertGlobals.vars.__filename,
          //__dirname: insertGlobals.vars.__dirname,
          //process: insertGlobals.vars.process,
          process: function() {
            return;
          },
        },
      })

Here is output from lambda (as expected __dirname is replaced during browserify):

{
  "errorMessage": "Cannot find module '/Users/ryan/jawstests/my-project/back/node_modules/aws-sdk/apis/metadata.json'",
  "errorType": "Error",
  "stackTrace": [
    "Function.Module._resolveFilename (module.js:338:15)",
    "Function.Module._load (module.js:280:25)",
    "Module.require (module.js:364:17)",
    "require (module.js:380:17)",
    "i (/var/task/main.js:1:481)",
    "/var/task/main.js:1:672",
    "a (/var/task/main.js:1:14017)",
    "Object.i [as services] (/var/task/main.js:1:14473)",
    "Object.<anonymous> (/var/task/main.js:3:18405)",
    "Object../api_loader (/var/task/main.js:3:18590)"
  ]
}

@AdityaManohar
Copy link
Contributor

I asked in that issue a week or so ago for some clarification. What am I missing here?

The AWS_SERVICES environment variable is used to select the services that you want to build in the browserify-ed distributable of the SDK. You need to set it only when you want to cherry-pick the services you are including support for. If you don't set this environment variable then it will include support for all the services that support CORS.

I hope this helps clarify things!

@doapp-ryanp
Copy link
Author

thanks that helps clarify how AWS_SERVICES is used. I want them all and I want to let browserify remove code that is not being used in my code path.

What needs to be done for browserify to correctly include metadata.json (and other json) files in apis? The docs for building via browserify do not work for the AWS SDK for node.

@lsegal
Copy link
Contributor

lsegal commented Sep 1, 2015

@doapp-ryanp

I want them all and I want to let browserify remove code that is not being used in my code path.

The problem here is that, due to the way the SDK is loaded (you just require('aws-sdk')), that's not possible. If browserify were to inspect your codepath based on that require call, it would be pulling in the entire SDK, regardless of which service classes you were using. That's why browserifying for Node.js does not really provide any advantage without AWS_SERVICES. Similarly, if we were to correctly parse the metadata file the same way we do in browser environments, we would be pulling in all services.

Is size on disk correlated to cold start time? It's worth noting that although there's a bit of data on disk, that data is not actually loaded until a service is constructed, so it does not impact require() time.

@lsegal
Copy link
Contributor

lsegal commented Sep 1, 2015

@doapp-ryanp also, if this really is an issue going forward, you might want to consider relying on the system-wide install of aws-sdk in Lambda, since that's always present and never requires an install-- and is kept fairly up-to-date. It might be worth elaborating why this is a blocker for you; the Lambda team would certainly be interested in knowing.

@doapp-ryanp
Copy link
Author

@lsegal thanks for the clear explanation.

Is size on disk correlated to cold start time

This does indeed seem to be the case. Ignoring runtime all together, larger the file the longer it takes to get into the container that Lambda creates. Leads to slower initial provision and provisioning of said container when scaling out. I know we are only talking about 500k here zipped for the aws-sdk, but since most npm mods work with browserify we figured it was a reasonable optimization step to run to shave every ms off of lambda start times that we could.

It might be worth elaborating why this is a blocker for you

Relying on the system-wide install of the aws-sdk, and the single nodejs version currently supported in lambda is a blocker because:

  • Security. If there is a security bug in aws-sdk 2.1.35 (current at time of writing) we are at the mercy of how fast AWS can update the code.
  • Backwards incompatibility. I see that the Node.js runtime in Lambda was just increased from 0.10.33 to 0.10.36. What if there was a backwards incompatible change? Same thing for aws-sdk for node. I create a lambda today that is coded against the system wide 2.1.35. 3 months from now the one and only available system-wide install goes to 2.2. I now have to upgrade/migrate my code to support the 2.2 breaking changes? Java solves this by shipping every previous runtime in current release. You then target your java runtime version.
  • Inability to leverage the fast moving ecosystem. Node.js, as you well know, is relatively young. Its evolving at a blistering pace. No more evident then the 4.0 release coming this month which brings in much needed ES6 features. I know you do not work on Lambda, but IMO for it to be successful in the Node.js space, it will have to stay in sync with the pace of he Node ecosystem. Same goes for aws-sdk for js. For example, the non-browser side of aws-sdk for js can migrate to ES6 quicker. As soon as it does, I will want to use it right away and not wait on the time it takes for the system-wide Lambda to adopt it. Specifically, two of the many reasons, is removal of polyfill cruft and things that are now native in ES6 like Promises.

I created this forum post back in May on this subject. Have not been given a clear direction on how Lambda team is going to handle these issues.

@lsegal
Copy link
Contributor

lsegal commented Sep 1, 2015

I can't respond to any of these with authority (you might want to ping your forum thread and get an official response from the current SDK maintainers), but from experience I can say that:

Security: if there was some kind of security patch going out with the SDK it's extremely likely that this release would be synchronized with Lambda's update of their version. AWS takes security patches very seriously, you're unlikely to see a significant delay when it comes to this stuff.

Backwards compatibility: The SDK follows Semantic Versioning. Going from 2.1 to 2.2 is not an incompatible update; only 2.x to 3.x. Lambda would NOT upgrade their SDKs from 2.x to 3.x if it involved breaking changes; that would break all of their existing users. I know from experience that AWS takes backwards compatibility very seriously, and so does this SDK.

For the rest you might want to get an official answer. Hope that helps!

@doapp-ryanp
Copy link
Author

Thanks for the prompt response.

Re: backwards compat: As it stands today, lambda only supports ONE version of aws-sdk and one version of Node.js. My point was more that this architecture decision in a node runtime is significantly flawed. I will indeed continue to try to take it up with them, however it is also in your best interest for them to support multiple because as it stands now you will never be able to release a v3.

This same logic applies for node.js runtime. It will cripple the ability for the product to move forward at the speed the node ecosystem requires.

Just for completeness sake, I am aware I can bundle my own (non browserified) version of aws-sdk and I can even bundle my own version of node/io.js. Both will increase the bloat (discussed before) and bundling the runtime as it stands today is a hack and has a CGI style interface (wont go into that discussion here).

@chrisradek chrisradek added Question service-api This issue is due to a problem in a service API, not the SDK implementation. labels Oct 26, 2015
@LiuJoyceC
Copy link
Contributor

Node.js 4.3 is now available in Lambda so you can leverage ES6 features in your Lambda functions, and Lambda does update its default version of the aws-sdk on a regular basis. While this does not resolve SDK's incompatibility with browserify --node, this should at least ameliorate the issue of "inability to leverage the fast moving ecosystem". Hope this helps!

@LiuJoyceC LiuJoyceC added closing-soon This issue will automatically close in 4 days unless further comments are made. and removed Question service-api This issue is due to a problem in a service API, not the SDK implementation. labels Apr 8, 2016
@LiuJoyceC
Copy link
Contributor

If you still would like to bundle the latest version of the SDK to use in Lambda and not use the default version of the SDK pre-installed in Lambda, you can zip your directory that contains both the code for your Lambda function and a node_modules folder that contains your desired version of the SDK. Then your require('aws-sdk') in your code should reference the aws-sdk module in your zip rather than the the pre-installed version in Lambda.
For more info: http://docs.aws.amazon.com/lambda/latest/dg/nodejs-create-deployment-pkg.html

If file size is a concern, there are folders and files you can delete from aws-sdk before you zip it. In the root aws-sdk folder, you can safely delete the dist, dist-tools, and scripts folders without affecting your code. If you are only using a few AWS services in your code, you can also delete most of the files in apis as well. However, if you delete the api files for a service (or a version of a service) you don't use, be sure to remove the service or version from apis/metadata.json, or an error will be thrown when you run your code. If you are not making use of waiters or paginators in your code, you can also delete those files in apis. Deleting these folders and files will cut down the SDK's size by over 4MB (the entire SDK is over 7MB).

@chrisradek
Copy link
Contributor

@doapp-ryanp
PR #1123 adds support for using browserify or webpack with node projects. It also lets you require individual services in your code.

Here's an example of a quick and dirty module:

// handler.js
var S3 = require('aws-sdk/clients/s3');

var handler = function() {
    var s3 = new S3();
    s3.headObject({
        Bucket: 'BUCKET',
        Key: 'KEY'
    }, function(err, data) {
        console.log(err, data);
    });
};

module.exports = handler;

Running browserify --node --standalone='lambda' handler.js -o bundle.js created a bundle that is significantly less than the size of the full SDK (in this example, less than 700 KB). That's without minifying the code as well. The same can also be done using webpack.

Let me know if this helps with your development! I'll update again once this feature is in master.

@doapp-ryanp
Copy link
Author

@chrisradek this is awesome!! This is a big step forward for NodeJS based Lambda's. Thank you for not forgetting about this issue.

What version release of the SDK do you anticipate this making it into?

@ac360 @flomotlik @eahefnawy heads up ^

chrisradek added a commit that referenced this issue Sep 9, 2016
@chrisradek
Copy link
Contributor

The PR was just merged and released as part of 2.6.0. We'll work on updating docs next, but you should be able to start taking advantage of this functionality now.

@doapp-ryanp
Copy link
Author

sweet. appreciate you getting it in a release so quick. I'll roll this into one of my projects here soon (week or 2) and report back.

@doapp-ryanp
Copy link
Author

@chrisradek sorry it took so long for me to get to this, but I have just tested the aws-sdk for JS against browserify using this new componentized functionality and it works AWESOME.

It is producing super small code sizes. This functionality will now allow me to stay up to date with the most recent aws-sdk-js without paying a performance penalty.

Great work!!

ps - this issue can be closed IMO

@chrisradek
Copy link
Contributor

@doapp-ryanp
Glad to hear this change is working out for you!

@lock
Copy link

lock bot commented Sep 29, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@lock lock bot locked as resolved and limited conversation to collaborators Sep 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
closing-soon This issue will automatically close in 4 days unless further comments are made. guidance Question that needs advice or information.
Projects
None yet
Development

No branches or pull requests

6 participants