Minify library size #157

Open
agershun opened this issue May 4, 2015 · 23 comments

Comments

@agershun
Member

agershun commented May 4, 2015

The library keeps growing slowly as we add new features. It is probably time to compress the code with more radical tools than UglifyJS:

  • remove all unnecessary code parts and functions
  • manually rename long variables to one or two letters (like alasql.databases to aq.db)
  • use something like Closure Compiler or other smart tools
  • split the library into small parts: core and additional plug-ins
@agershun
Member Author

Latest findings:

  • Shortening some property and class names can save 30kb
  • Closure Compiler saves 120kb compared with Uglify

So the library size can be reduced to as little as 310kb.

@agershun
Member Author

Done:

Starting point (after UglifyJS) = 422kb

  • Closure Compiler: minus 80kb
  • utils/redj.js: minus 20kb

Current size of the library: 322kb

@mathiasrw
Member

Anything new on getting deeper into the Closure flags?

It would be good for the size of the lib to use the force of ADVANCED_OPTIMIZATIONS https://developers.google.com/closure/compiler/docs/compilation_levels?csw=1

Are we using Closure today? When I run it on 0.1.7 I get the following errors:

JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 6997 character 25 in alasql.js
            srcwherefn: returnTrue,
                         ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 7948 character 92 in alasql.js
...ypeof '+colexp+'!="undefined" && (!g[\'$$_VALUES_'+colas+'\']['+colexp+'])) \
                                                                          ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8151 character 26 in alasql.js
                            dbenum: tcol.dbenum,
                          ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8158 character 38 in alasql.js
                            columnid:col.as || col.columnid, 
                                      ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8172 character 38 in alasql.js
                            columnid:col.as || col.columnid, 
                                      ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8205 character 56 in alasql.js
                            columnid:col.as || col.columnid || col.toString(), 
                                                        ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8228 character 56 in alasql.js
                            columnid:col.as || col.columnid || col.toString(), 
                                                        ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 13754 character 10 in alasql.js
        var s = '<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="u...
          ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 13760 character 53 in alasql.js
        s+=' <x:ExcelWorksheet><x:Name>' + sheet.sheetid + '</x:Name><x:WorksheetOp...
                                                     ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 14059 character 11 in alasql.js
        var s1 = '<?xml version="1.0"?> \
           ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 14165 character 40 in alasql.js
            s3 +='<Worksheet ss:Name="'+sheetid+'"> \
                                        ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 14168 character 8 in alasql.js
                    +'" x:FullColumns="1" \
        ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 16679 character 7 in alasql.js
            js+="');\
       ^
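
Both error classes in the log above have mechanical fixes. The sketch below uses made-up stand-ins rather than the actual alasql.js lines: the comma after the last property is dropped, and a backslash line continuation is replaced with explicit concatenation. As the message itself suggests, setting the language_in option to a newer language level is the alternative.

```javascript
// Hypothetical stand-ins for the flagged code; names and contents are illustrative only.
function returnTrue() { return true; }

// Before: { srcwherefn: returnTrue, }  <- the trailing comma trips JSC_TRAILING_COMMA.
var query = {
  srcwherefn: returnTrue
};

// Before: a string literal continued with a backslash at the end of the line,
// which triggers JSC_PARSE_ERROR in the older language mode.
// After: explicit concatenation, which is valid in every language mode.
var sheetid = 'Sheet1';
var s3 = '<Worksheet ss:Name="' + sheetid + '">' +
         '<Table>';
```

Note that a backslash continuation also keeps any indentation of the following line inside the string, so concatenation can shrink the generated output slightly as well.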

@agershun
Member Author

agershun commented Jun 1, 2015

Due to some problems I have temporarily turned off Closure Compiler until we finish
with Codex and the basic documentation.

Let us spend this week on manuals and documentation (I think we need to
transfer at least all the test examples to the documentation), and then turn to
the minification and Cordova goals.


@agershun
Member Author

agershun commented Jun 1, 2015

All of them can be solved; it simply requires time.

Unfortunately, we currently cannot use ADVANCED_OPTIMIZATIONS, because
Closure Compiler skips one of the parser functions, so we need to
investigate this.


@mathiasrw
Member

Sure - it's not on the roadmap for now - just want to keep the fire going :)

Input for 300K

  • We have a lot of strings in the code, especially chunks of SQL such as "NOT BETWEEN"==this.op. We might want to look into having a dictionary, as in var _NOT_BETWEEN = "NOT BETWEEN", and changing the use in the code to _NOT_BETWEEN==this.op, since the minifier will then turn it into, for example, s5==this.op
  • The error message strings are very long (and good), for example Cannot resolve column "'+this.columnid+'" because it exists in two source tables. We might want to give an error code instead.
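
A minimal sketch of the two ideas above, with hypothetical names throughout (this is not AlaSQL's actual source). It assumes the constants live inside the library's wrapping function scope, since minifiers typically only shorten local names.

```javascript
// Hypothetical names; illustrative only.
var _NOT_BETWEEN = 'NOT BETWEEN';
var _BETWEEN = 'BETWEEN';

// Each comparison references a variable that a minifier can shorten,
// instead of repeating the full string literal.
function describeOp(node) {
  if (node.op === _NOT_BETWEEN) return 'negated range';
  if (node.op === _BETWEEN) return 'range';
  return 'other';
}

// Error codes instead of long messages: the text lives in one table,
// which could ship only with a development build.
var ERROR_TEXT = {
  E001: 'Cannot resolve column because it exists in two source tables'
};

function errorMessage(code, detail) {
  var text = ERROR_TEXT[code];
  return text ? code + ': ' + text + ' (' + detail + ')' : code;
}
```

Whether the dictionary actually wins depends on how often each string repeats: a literal used once gains nothing from being moved into a variable.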

If we get more extreme about size we can look into the following, but we must run tests to make sure it does not affect speed:

  • this appears 1767 times in the min version. We could check whether there are scopes where it is used many times and alias it to a one-letter variable, for example £=this; and reference that instead
  • We call "throw new Error" 133 times. We could call a function that contains the throw only once. Problem: we lose the line number / direct link in the error to the specific line.

(Thinking here whether it's an idea to have a production version and a dev version, where dev gives nice errors from the correct line and prod only provides an error code.)

(Also, having Excel as a module would remove the template strings from the core.)
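
The single-throw idea and the dev/prod split could be combined roughly as follows. This is a sketch under assumptions: DEV, MESSAGES, fail and resolveColumn are hypothetical names, and DEV is imagined to be set by the build rather than being an existing AlaSQL flag.

```javascript
// Hypothetical build-time switch; not an existing AlaSQL option.
var DEV = true;

var MESSAGES = {
  1: 'Cannot resolve column',
  2: 'Table does not exist'
};

// The only place that contains "throw new Error".
function fail(code, detail) {
  var text = DEV ? MESSAGES[code] + ': ' + detail : 'E' + code;
  throw new Error(text);
}

function resolveColumn(columns, columnid) {
  if (columns.indexOf(columnid) === -1) fail(1, columnid);
  return columnid;
}
```

The trade-off mentioned above still applies: the top stack frame now points at fail, so the caller has to be read from the next frame down.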

@agershun
Member Author

agershun commented Jun 1, 2015

Cool!

You can look at the utils\ directory. It already has some prototypes of size optimization utils for the parser and other words. We can come back to them soon.

@mathiasrw
Member

If we rename toJavaScript to toJS we save 1kb in the min version.
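
A rough way to estimate such a saving before doing the rename, assuming single-byte characters and that every occurrence gets renamed (estimateRenameSavings is a hypothetical helper, not part of the repository):

```javascript
// Estimate the bytes saved by renaming an identifier in a source string.
// Counts plain substring occurrences, so it also matches inside longer names.
function estimateRenameSavings(source, from, to) {
  var count = source.split(from).length - 1;
  return count * (from.length - to.length);
}
```

toJavaScript is 12 characters and toJS is 4, so each occurrence saves 8 bytes; a saving of about 1kb would correspond to roughly 125 occurrences.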

@agershun
Member Author

Good! Can we call it .JS?


@mathiasrw
Member

Sure. I'll rename .toJavaScript to .JS in the src files.

@agershun
Member Author

Ok


@mathiasrw
Member

Renaming to .JS kept giving errors and I could not identify where the issue came from, so it's .toJS until we look at it again. Please pull the changes from develop.

@mathiasrw
Member

With minor changes in the code and the language set to ECMASCRIPT5, I got Closure to compile in advanced mode. It's 307.75 kB (the .min version is at the moment 437.271 kB).

I will run the tests on the code to verify that things still work.

I have a feeling we need more work before the tests will be OK.

If we start using Closure, it is important to document the functions with correct comments, as Closure uses these to typecheck https://developers.google.com/closure/compiler/docs/js-for-compiler?csw=1#types
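
For illustration, this is roughly what such an annotated function looks like. pluck is a hypothetical helper, not a function from the AlaSQL source; the tags follow the JSDoc-style type annotations described at the link above.

```javascript
/**
 * Hypothetical helper, annotated in the style Closure Compiler reads.
 * @param {string} columnid Name of the column to read.
 * @param {!Array<!Object>} rows Rows to read from.
 * @return {!Array<*>} The column values, one per row.
 */
function pluck(columnid, rows) {
  return rows.map(function (row) {
    return row[columnid];
  });
}
```

The annotations are ordinary comments, so they cost nothing at runtime and are stripped from the minified output.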

@agershun
Member Author

Unfortunately, this flag kills the parser. We have the option to change something in the parser code to prevent this overoptimization.
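
One general technique for shielding code from advanced-mode renaming is to use quoted property names, which the compiler leaves untouched. Whether renaming is what actually breaks the parser here has not been established in this thread, so the sketch below only illustrates the technique on a made-up miniature parser object.

```javascript
// Hypothetical miniature of a generated parser object; not AlaSQL's real parser.
var parser = {};

// Quoted property names are not renamed in advanced mode,
// so code that looks them up by string at runtime keeps working.
parser['parse'] = function (sql) {
  return { statement: sql.trim().split(/\s+/)[0].toUpperCase() };
};

// Lookup by string, as generated code might do.
function callByName(obj, name, arg) {
  return obj[name](arg);
}
```

The cost is that quoted names are not shortened, so this only makes sense for the small set of properties that must survive under their original names.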

@mathiasrw
Member

We could also keep the parser external while Closure processes the rest and bundle it in afterwards, so we have a kind of stub in the code and then replace it afterwards.

@noid2

noid2 commented Nov 21, 2019

Is there a way to make the full library tree-shakable?
Example: I am using the library only to analyse complex JSON data in the client and I have to load the full 440KB of script. It would be awesome if I could extract only the code needed for this task. This makes sense since I will not import or export files or use any other "db engine" in the background.

@mathiasrw
Member

@noid2 Nope. It won't work with tree shaking, because one third of the code is the SQL parser and another third builds custom functions that are eval'ed to use the last third of the code.

It's a bit of a bummer...
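
A toy illustration of why generated-and-eval'ed code defeats tree shaking. All names are hypothetical and this is not how AlaSQL is actually structured in detail: the point is only that the reference to a helper exists as text until runtime.

```javascript
// Hypothetical helper table standing in for the runtime part of the library.
var helpers = {
  like: function (value, prefix) { return String(value).indexOf(prefix) === 0; },
  upper: function (value) { return String(value).toUpperCase(); }
};

// The function body is assembled as a string, so the reference to
// h.like exists only as text until the function is created at runtime.
function compilePrefixFilter(column, prefix) {
  var body = 'return h.like(row[' + JSON.stringify(column) + '], ' +
             JSON.stringify(prefix) + ');';
  return new Function('row', 'h', body);
}

var startsWithA = compilePrefixFilter('name', 'A');
```

A bundler analysing this statically cannot tell that like is needed and upper is not, so it has to keep every helper.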

@mathiasrw
Member

@agershun have you ever considered building the functions into strings and exporting the strings as executable functions, so people could use only the compiled version of the functions they need?

@mathiasrw
Member

mathiasrw commented May 12, 2020

The parser is about half the library.

It could be worth looking into using https://sap.github.io/chevrotain/

The grammar is in another format, so it might be a risky job to swap over. It's about 8x as fast as the JISON parser. It would be interesting to see if this is actually worth it, as parsing the SQL is often not the thing that takes time.

Nice sandpit: https://sap.github.io/chevrotain/playground/

@noid2

noid2 commented Oct 7, 2020

I spent some time studying the source code and my conclusion is that it's about time to start building a new major version from scratch.
If you are willing to consider this as an option, I will be more than happy to share my findings and suggestions for this approach.
Obviously this means that I will be an active contributor to developing the next version.

P.S. I come from data science (10+ years) and I am not a professional developer, but I have been in love with JavaScript for the past 4 years :)

@mathiasrw
Member

Hi @noid2 You are welcome to come and join.

Taking the lead on the next version would be very helpful. Some work has been put into converting the current code base to modules, the first step towards getting the code modernized. If you feel an approach from scratch is helpful we are open to that, but it's a lot of work.

As I see it, the best approach would be to make a tiny version with only select and simple functions, which can then have more functionality added to it. The nasty part is the parser: almost half of the size, and most people don't use much of it any more.

Another approach would be to handle alasql functions at build time, a bit like Svelte, where the actual code to do the magic is generated so that no parser or alasql is needed in production, letting alasql basically become a code generator. (It is a code generator now under the hood.)
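
A minimal sketch of the build-time idea, assuming a hypothetical generateFilterSource step that is not AlaSQL's actual generator: a simple query description is turned into the source text of a standalone function, which would then be written into the bundle.

```javascript
// Hypothetical build-time step: emit the source of a standalone filter function.
function generateFilterSource(name, column, operator, value) {
  var allowed = ['>', '<', '>=', '<=', '===', '!=='];
  if (allowed.indexOf(operator) === -1) {
    throw new Error('Unsupported operator: ' + operator);
  }
  return 'function ' + name + '(rows) {\n' +
         '  return rows.filter(function (row) {\n' +
         '    return row[' + JSON.stringify(column) + '] ' + operator + ' ' +
         JSON.stringify(value) + ';\n' +
         '  });\n' +
         '}';
}

// At build time this string would be written to a file in the bundle.
var source = generateFilterSource('adults', 'age', '>=', 18);
```

The production bundle then contains only the emitted adults function, with no parser and no generator.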

What approach have you been considering?

@noid2

noid2 commented Oct 9, 2020

Hi @mathiasrw

It's nice that you agree on a new version. The current version is really awesome and packed with features. Of course there are some bugs, but it gets the job done. What I think is truly unique is the ability to query JSON data with SQL and to use SQL, AlaSQL and JS functions while selecting. That's the best of the SQL and JS worlds combined. PostgreSQL has similar functionality but it's not as natural as AlaSQL.

My idea is to have a high-performance core library that solves the core problem of querying data using SQL in the JS world. This can be achieved by first defining one "standard" and following it.
This standard must define:

  1. Data structure [e.g. data input/output only as JSON (array of objects)]. If the user needs to query other data structures, they should first be transformed into the standard structure using other functions which are not part of the core library. The same goes for output.
  2. SQL features. As you mentioned in your approach, the core library should only select data with simple functions, and more functionality can be added when needed.
  3. Adapters. Connections to data sources and data exports. A single file that can be imported separately.
  4. JS ecosystem. Includes programming paradigms, ECMAScript support, dependencies, compatibility, naming, etc.
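
The split between points 1 and 3 could look roughly like this. Both functions are hypothetical sketches, not proposed APIs: the adapter converts an outside format into the standard array of objects, and the core only ever sees that structure.

```javascript
// Hypothetical adapter, living outside the core: CSV text -> array of objects.
// Assumes a header row and no quoted or escaped fields.
function fromCsv(text) {
  var lines = text.trim().split('\n');
  var header = lines[0].split(',');
  return lines.slice(1).map(function (line) {
    var cells = line.split(',');
    var row = {};
    header.forEach(function (name, i) { row[name] = cells[i]; });
    return row;
  });
}

// Hypothetical core: only understands arrays of objects.
function select(rows, columns, where) {
  return rows.filter(where).map(function (row) {
    var out = {};
    columns.forEach(function (name) { out[name] = row[name]; });
    return out;
  });
}
```

A caller would chain the two, for example select(fromCsv(text), ['name'], predicate), and a project that never reads CSV would simply not import the adapter.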

I know this looks overwhelming, but it's not so much compared to the effort put into maintaining the current version. Also, when I say "from scratch" I do not mean writing every bit of code again; I mean reusing as much as possible of the existing code.

I suggest that the starting point would be to strip down the current version and remove everything except the select feature. I have already tried this, but the parser is big and, for me, very confusing. I do not understand much of it.

Of course there are many other details to discuss.

@mathiasrw mathiasrw mentioned this issue Oct 9, 2020
@mathiasrw
Member

@noid2 If you feel like copying your comment (or making a new one) into #1240 we can continue the dialogue there...
