forked from cnuernber/charred

zero dependency efficient read/write of json and csv data.

ryantate/charred

Charred

Efficient character-based file parsing for csv and json formats.

  • Zero dependencies.

  • As fast as univocity or jackson.

  • Same API as clojure.data.csv and clojure.data.json, implemented far more efficiently.

  • API Documentation

Usage

user> (require '[charred.api :as charred])
nil
user> (charred/read-json "{\"a\": 1, \"b\": 2}")
{"a" 1, "b" 2}
user> (charred/read-json "{\"a\": 1, \"b\": 2}" :key-fn keyword)
{:a 1, :b 2}
user> (println (charred/write-json-str *1))
{
  "a": 1,
  "b": 2
}

A Note About Efficiency

If you are reading or writing a lot of small JSON objects, the best option is to create a specialized parse fn with exactly the options you need and pass in strings or char[] data. A similar pathway exists for high-performance writing of JSON objects. The returned functions are safe to use in multithreaded contexts.
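As a minimal sketch of the specialized-function pathway described above, assuming the charred.api functions parse-json-fn and write-json-fn are available in the version you are using (they are not shown elsewhere in this README, so check the API documentation for the exact names and options):

```clojure
(require '[charred.api :as charred])

;; Build the parse fn once, with the options fixed up front, then reuse it.
;; Assumption: parse-json-fn takes an options map and returns a fn of one input.
(def parse-kw-json (charred/parse-json-fn {:key-fn keyword}))

;; Each call takes a string (or char[]) holding a single JSON document.
(def parsed (parse-kw-json "{\"a\": 1, \"b\": 2}"))

;; Assumption: write-json-fn takes an options map and returns a fn of
;; two arguments, the output and the data to write.
(def write-small-json (charred/write-json-fn {}))

(def json-text
  (let [out (java.io.StringWriter.)]
    (write-small-json out parsed)
    (str out)))
```

Creating the functions once and reusing them avoids resolving the options on every call, which is where the gain for many small objects is expected to come from.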

The system is tuned overall for large files, since that is when performance really matters. Small files or input streams should be set up with :async? false and a smaller :bufsize argument such as 8192, as there is no gain from async loading when the file or stream is smaller than 1MB. For smaller streams, slurping into strings in an offline threadpool will lead to the highest performance. If you know you are going to parse many files of a particular size, you should grid search :bufsize and :async?, as that is a tuning pathway that I haven't put a ton of time into.
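A small sketch of the small-input settings mentioned above, passing :async? and :bufsize as trailing keyword arguments in the same style as :key-fn in the Usage section. The StringReader stands in for a small file or stream, and read-small-json is a hypothetical helper name used only for illustration:

```clojure
(require '[charred.api :as charred])

;; Hypothetical helper: read one small JSON document from a reader or stream
;; with async loading turned off and a smaller buffer.
(defn read-small-json
  [input]
  (charred/read-json input :async? false :bufsize 8192))

(def small-result
  (with-open [rdr (java.io.StringReader. "{\"a\": 1, \"b\": 2}")]
    (read-small-json rdr)))
```

When the data is already a string in memory, these options should not matter much; they are aimed at reader and stream inputs.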

All the parsing systems have mutable options. These can be somewhat faster, and it is interesting to look at the tradeoffs involved. Parsing a csv using the raw supplier interface is a bit faster than using the Clojure sequence pathway into persistent vectors, and it probably won't change how you consume the rows, so it may be worth trying.
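To make the two csv pathways concrete, here is a rough sketch that reads the same data through the sequence API and through a row supplier. It assumes read-csv-supplier exists in charred.api and returns a java.util.function.Supplier whose .get yields the next row, or nil once the input is exhausted; verify this against the API documentation for your version:

```clojure
(require '[charred.api :as charred])

(def csv-text "a,b\n1,2\n3,4")

;; Sequence pathway: a clojure.data.csv-style sequence of rows.
(def seq-rows (charred/read-csv csv-text))

;; Supplier pathway (assumed API): pull rows one at a time until nil.
(def supplier-row-count
  (let [^java.util.function.Supplier rows (charred/read-csv-supplier csv-text)]
    (loop [n 0]
      (if (some? (.get rows))
        (recur (inc n))
        n))))
```

The consuming loop has much the same shape either way, so switching pathways is mostly a matter of how rows are pulled rather than how they are processed.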

Development

Before running a REPL you must compile the Java files into target/classes. This directory will then be on your classpath.

scripts/compile

Tests can be run with scripts/run-tests, which will compile the Java and then run the tests.

License

MIT license.
