A library of simple utilities and syntactic sugar for commonly used functionality.
The Maven artifact is available from Clojars. The Leiningen dependency looks like this:
[boynton/clj-util "0.0.10"]
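For context, a minimal project.clj that pulls in the dependency might look like the following sketch; the project name, version, and Clojure version here are placeholders, not part of this library:

```clojure
;; A minimal project.clj sketch. "my-app", its version, and the
;; Clojure version are placeholders chosen for illustration.
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]
                 [boynton/clj-util "0.0.10"]])
```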
The UUID API is a set of simple wrappers for the UUID library written by Tatu Saloranta.
You can create UUIDs based on time, URLs, and names (using a UUID as the namespace). Additionally, the uuid-to-sortable-string
function produces a string representation of the UUID that sorts correctly for time-based UUIDs.
(use 'util.uuid)
(let [namespace-uuid (uuid-from-url "https://github.com/boynton")
name-uuid (uuid-from-name namespace-uuid "foo")
revision-uuid (uuid-from-current-time)]
(println "url-based UUID from 'https://github.com/boynton':" namespace-uuid)
(println "using that as a namespace, the name-based UUID for 'foo' is" name-uuid)
(println "here is a time-based UUID:" revision-uuid)
(println "here is that same UUID as a sortable string:" (uuid-to-sortable-string revision-uuid)))
The util.storage namespace provides two simple storage abstractions: one for blob storage and one for structured storage.
Blob storage is defined by the IBlobStorage interface, with methods to put, get, scan, and delete Blob objects. A Blob
object is a record holding the data along with content-length, content-type, and last-modified attributes.
The interface can be implemented in Java, but a few implementations are provided in Clojure: util.file provides a simple
wrapper around a local filesystem directory, and util.s3 provides a wrapper around Amazon's S3.
For example:
(use 'util.storage)
(use 'util.file)
(let [store (file-blobstore "dirname")
      blob1 (string-blob "This is a test" :content-type "text/plain")
      blob2 (file-blob "some-file.jar")
      file (java.io.File. "README.md")
      blob3 (blob (java.io.FileInputStream. file)
                  :content-type "text/markdown"
                  :content-length (.length file)
                  :last-modified (.lastModified file))]
  (put-blob store "one" blob1)
  (put-blob store "two" blob2)
  (put-blob store "three" blob3)
  (println (get-blob store "one"))
  (let [byte-array (:data (read-blob-fully (get-blob store "three")))]
    (println "byte array retrieved with length" (count byte-array)))
  (let [b (get-blob store "two")]
    (write-blob-file b "some-other-file.jar")) ; the streams are copied; the blob is not held in memory all at once
  (loop [chunk (scan-blobs store)]
    (doseq [info (:summaries chunk)]
      (println info))
    (when-let [more (:more chunk)]
      (println "-------")
      (recur (scan-blobs store :skip more)))))
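The interface also defines a delete operation. Assuming the Clojure wrapper follows the same naming pattern as put-blob and get-blob (delete-blob is an assumed name, not confirmed by this README), removing an entry might look like:

```clojure
(use 'util.storage)
(use 'util.file)

;; delete-blob is an assumed name, following the put-blob/get-blob pattern.
(let [store (file-blobstore "dirname")]
  (delete-blob store "one")
  ;; a subsequent get should no longer find the key
  (println (get-blob store "one")))
```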
Structured storage is aimed at smaller structured values in a key/value store. An IStructStorage interface is provided
with put, get, scan, and delete operations; the value passed around is a clojure.lang.IPersistentMap, with the
requirement that the object be serializable to JSON. The value size is limited to 64k to be compatible with a variety of
storage services.
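Because of the 64k cap, a caller may want to check the serialized size before a put. A minimal sketch, assuming clojure.data.json is on the classpath (it is a separate dependency, not part of this library):

```clojure
(require '[clojure.data.json :as json])

;; Returns true if the map's JSON encoding fits under the 64k limit.
(defn fits-64k? [m]
  (< (alength (.getBytes (json/write-str m) "UTF-8")) 65536))

(fits-64k? {:title "This is a test" :foo [1 2 3]}) ; => true
```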
The provided implementations include util.jdbc, which offers a generic JDBC implementation with both SQLite and H2 examples,
and util.dynamo, which provides identical functionality on Amazon's DynamoDB service.
For example:
(use 'util.storage)
(use 'util.jdbc)
(let [store (sqlite-structstore "filename")]
  (put-struct store "one" {:title "This is a test" :foo [1 2 3]})
  (put-struct store "two" {:title "Another one" :bar {:bletch 23}})
  (println "one:" (get-struct store "one"))
  (loop [chunk (scan-structs store)]
    (doseq [[key struct] (:structs chunk)]
      (println key "->" struct))
    (when (:more chunk)
      (recur (scan-structs store :skip (:more chunk))))))
The blob and struct examples above also work with S3 and DynamoDB, respectively. Both require that you define your Amazon credentials in
the environment and then access them with util.aws, as follows:
(use 'util.storage)
(use 'util.aws)
(use 'util.s3)
(let [store (s3-blobstore "my-s3-bucket-name")]
  (put-blob store "one" blob1)
  ...
The credentials are fetched with the aws-credentials function in util.aws.
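The README does not specify which environment variables util.aws reads; assuming it follows the standard AWS naming convention (an assumption, not confirmed here), the credentials might be set like this:

```shell
# Variable names are an assumption; util.aws's exact environment
# lookup is not documented in this README.
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
```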
There also is an interface to EC2 for provisioning and talking to clusters of machines in the cloud.
For example:
(use 'util.aws)
(use 'util.ec2)
(let [machines (create-cluster "myname" 3 :ami "ami-aecd60c7" :type "t1.micro" :keypair "ec2keypair" :user "ec2-user" :security "default")]
  (doseq [machine machines] ; doseq rather than map: map is lazy, so its side effects would never run
    (command machine "sudo yum -y install emacs"))
  (put-file (first machines) "my/local/file" "remote/file")
  (get-file (second machines) "remote/file" "local/file")
  ...)
A more involved example is in the works.
These are minimal (as in: very primitive) utilities, built on the util.ec2 package, to deploy D. J. Bernstein's daemontools,
Apache ZooKeeper, and Storm to the cloud.
For example, to deploy a 6-node Storm cluster (1 ZooKeeper, 1 Nimbus, and 4 workers) to EC2:
(use 'util.storm)
(deploy-storm :workers 4)
It takes less than 15 minutes for everything to get running, initialized, deployed, and launched under supervision. It automatically
modifies your local ~/.storm/storm.yaml file so that the local Storm client connects to the new cluster. It uses a number of defaults that
should probably be made overridable, but it is a starting point, anyway. This is not meant to replace a more comprehensive solution
like Pallet/jclouds, but I had trouble debugging problems with that system, so I wrote this quick hack for playing around. Don't
take it too seriously :-)
The same thing works for S4, now that v0.5 makes things easier:
(use 'util.s4)
(deploy-s4 "my-cluster" :nodes 4)
(status-s4 "my-cluster")
The difference is that for S4, the EC2 cluster is dedicated to one logical S4 "cluster"; different partitioning (and node allocation) requires deploying a separate cluster.
By default, only the "us-east" AWS endpoint is supported. To change it to any other endpoint, use the standard AWS API tools environment setting:
export EC2_URL="endpoint_here"
where "endpoint_here" is one of the following:
ec2.eu-west-1.amazonaws.com
ec2.sa-east-1.amazonaws.com
ec2.us-east-1.amazonaws.com
ec2.ap-northeast-1.amazonaws.com
ec2.us-west-2.amazonaws.com
ec2.us-west-1.amazonaws.com
ec2.ap-southeast-1.amazonaws.com
ec2.ap-southeast-2.amazonaws.com
If the setup is behind a proxy, edit ec2.clj to set the proper proxy host (shown as "proxy.address.com" below) and the corresponding proxy port:
(defn- ec2 [cred]
  (AmazonEC2Client. (BasicAWSCredentials. (:access cred) (:secret cred))
                    (doto (new com.amazonaws.ClientConfiguration)
                      (.withProxyHost "proxy.address.com")
                      (.withProxyPort proxy_port))))
To deploy an S4 app to a newly created cluster in AWS, use the following (the case of a Twitter counter):
(use 'util.s4)
(deploy-s4 "my-cluster" :nodes 4 :params "s4.adapter.output.stream" :value "RawStatus" :key "key_name")
(deploy-app-s4 "my-app" :cluster "my-cluster")
(status-s4 "my-app")
The code above uses the s4 utility package for cluster deployment; the S4 app deployment targets the "my-cluster" cluster by name. It is now possible to pass parameters to the cluster.
Original code: Lee Boynton
Updates/contributions: Sergey Boldyrev
Distributed under the Eclipse Public License, the same as Clojure.