Two examples here demonstrate how the zsv CSV parser can be compiled to WebAssembly and called from JavaScript: via a static page in a browser, or as a Node module.

Most of the operative code is in `js/foot.js`, which effectively just converts between JavaScript and emscripten.
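For orientation, below is a minimal sketch of the kind of glue such code performs, assuming an emscripten build with `MODULARIZE` and with `cwrap`, `lengthBytesUTF8`, `stringToUTF8`, `_malloc` and `_free` exported; the `zsv_*` export names and the `./zsv.js` filename are illustrative, not necessarily what this example actually produces:

```js
// Minimal sketch (not the actual foot.js): the typical emscripten glue pattern
// for calling a wasm C parser from JavaScript. Assumes the module was built
// with MODULARIZE and exports cwrap, lengthBytesUTF8, stringToUTF8, _malloc
// and _free. The zsv_* names and the ./zsv.js filename are illustrative.
const createModule = require('./zsv.js');

createModule().then((Module) => {
  // Wrap exported C functions as plain JavaScript functions
  const parserNew    = Module.cwrap('zsv_new', 'number', ['number']);
  const parseBytes   = Module.cwrap('zsv_parse_bytes', 'number',
                                    ['number', 'number', 'number']);
  const parserDelete = Module.cwrap('zsv_delete', 'number', ['number']);

  // Copy a JavaScript string into wasm linear memory so the C code can read it
  const csv = 'a,b,c\n1,2,3\n';
  const len = Module.lengthBytesUTF8(csv);
  const buf = Module._malloc(len + 1);
  Module.stringToUTF8(csv, buf, len + 1);

  const parser = parserNew(0);     // NULL options for this sketch
  parseBytes(parser, buf, len);    // row/cell callbacks fire inside the library
  parserDelete(parser);
  Module._free(buf);
});
```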
To run the browser demo, run `make run`. Static files will be built in a subdirectory of the build directory, and a local Python HTTPS server will be started to serve them on https://127.0.0.1:8888.

You can view a demo of the built example here.
To build a Node module, run `make node`. Module files will be placed in `node/node_modules/zsv-parser`.
To run a test via Node, run `make test`. The Node module will be built, a sample program that reads CSV from stdin and outputs JSON will be copied to `node/index.js`, and a test will be run.
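For illustration, here is a hedged sketch of what such a stdin-to-JSON program could look like; the `zsv-parser` API shown (`createParser`, `push`, `finish`, `rowHandler`) is a hypothetical placeholder for the module's real interface, so consult the generated `node/index.js` for actual usage:

```js
// Hypothetical sketch of a stdin-to-JSON program in the spirit of the sample
// node/index.js. The zsv-parser API used here (createParser, push, finish,
// rowHandler) is a placeholder; see the generated module for the real one.
const zsvParser = require('zsv-parser');

const rows = [];
const parser = zsvParser.createParser({
  rowHandler: (cells) => rows.push(cells), // called once per parsed row
});

process.stdin.on('data', (chunk) => parser.push(chunk)); // feed CSV chunks
process.stdin.on('end', () => {
  parser.finish();                                       // flush the last row
  process.stdout.write(JSON.stringify(rows) + '\n');     // emit JSON
});
```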
To build, you need emscripten. To run the example web server, you need python3. Unlike some of the other examples, this example does not require that libzsv is already installed.
- From the zsv base directory, run configure in your emscripten environment and save the config output to `config.emcc`:
  `CROSS_COMPILING=yes emconfigure ./configure CONFIGFILE=config.emcc`
- Change back to this directory (`examples/js`), then run `emmake make run`. You should see output messages ending with `Listening on https://127.0.0.1:8888`
- Navigate to https://127.0.0.1:8888. If you get a browser warning, then using Chrome you can type "thisisunsafe" to proceed.
- Click the button to upload a file (a rough sketch of that kind of upload handler is shown below).
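For a rough idea of what an upload handler like the demo's might do, here is a hedged sketch; the element id and the `zsvParser` browser API are hypothetical placeholders, not the demo page's actual code:

```js
// Sketch only: wiring an upload button to the parser in the browser. The
// element id and the zsvParser API are hypothetical placeholders, not the
// demo page's actual code.
document.querySelector('#csv-file').addEventListener('change', async (ev) => {
  const file = ev.target.files[0];
  if (!file) return;

  const bytes = new Uint8Array(await file.arrayBuffer());
  let rowCount = 0;

  // window.zsvParser is assumed to be a global exposed by the built page
  const parser = window.zsvParser.createParser({
    rowHandler: () => { rowCount++; },  // count rows; no cell data is copied
  });
  parser.push(bytes);
  parser.finish();
  console.log(`parsed ${rowCount} rows from ${file.name}`);
});
```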
Running the ZSV lib from JavaScript is still experimental and not yet fully optimized. Some performance challenges are particular to WebAssembly + JavaScript, e.g. where a lot of string data is passed between JavaScript and the library (see e.g. https://hacks.mozilla.org/2019/08/webassembly-interface-types/).
However, initial results are promising:
- Running only "count", zsv-lib is ~90%+ faster than `csv-parser` and `papaparse`
- The more cell data that is fetched, the more this advantage diminishes due to the aforementioned JavaScript/wasm memory overhead. Our benchmarking suggests that if the entire row's data is fetched, performance is about on par with both `csv-parser` and `papaparse`. If only a portion is fetched, performance is about the same as `papaparse`, and faster than `csv-parser` (how much faster being roughly proportional to the difference between count (~90% faster) and the amount of total data fetched)
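To make that tradeoff concrete, here is a hedged sketch contrasting a count-only callback with one that materializes every cell, using the same hypothetical callback-style API as the earlier sketches:

```js
// Illustration of the tradeoff described above, using the same hypothetical
// callback-style zsv-parser API as in the earlier sketches.
const zsvParser = require('zsv-parser');

let total = 0;
const allRows = [];

// "count"-style use: only a counter is touched per row, so almost no string
// data crosses the JS/wasm boundary
const countingParser = zsvParser.createParser({
  rowHandler: () => { total++; },
});

// full-fetch use: every cell value is copied into a JavaScript string, which
// is where the JS/wasm string-passing overhead shows up
const copyingParser = zsvParser.createParser({
  rowHandler: (cells) => { allRows.push(cells); },
});
```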
Separate commands can be used for build, node, run and clean:

- `make build`
- `make node`
- `make run`
- `make clean`
Add `MINIFY=1` to any of the above to generate minified code.

To run benchmark tests: `make benchmark`

To see all make options: `make`