A library of simple utilities and syntactic sugar for commonly used functionality.
The Maven artifact is available from Clojars. The Leiningen dependency looks like this:
[boynton/clj-util "0.0.10"]
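For context, a minimal project.clj that pulls in the dependency might look like the following sketch; the project name, version, and Clojure version here are placeholders, not part of this library:

```clojure
;; A minimal project.clj sketch. "my-app", its version, and the
;; Clojure version are placeholders chosen for illustration.
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]
                 [boynton/clj-util "0.0.10"]])
```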
The UUID API is a set of simple wrappers for the UUID library written by Tatu Saloranta.
You can create UUIDs based on time, URLs, and names (using a UUID as the namespace). Additionally, the uuid-to-sortable-string
function produces a string representation of the UUID that sorts correctly for time-based UUIDs.
(use 'util.uuid)
(let [namespace-uuid (uuid-from-url "https://github.com/boynton")
name-uuid (uuid-from-name namespace-uuid "foo")
revision-uuid (uuid-from-current-time)]
(println "url-based UUID from 'https://github.com/boynton':" namespace-uuid)
(println "using that as a namespace, the name-based UUID for 'foo' is" name-uuid)
(println "here is a time-based UUID:" revision-uuid)
(println "here is that same UUID as a sortable string:" (uuid-to-sortable-string revision-uuid)))
The util.storage namespace provides two simple storage abstractions: one for blob storage and one for structured storage.
Blob storage is defined by the IBlobStorage interface, with methods to put, get, scan, and delete Blob objects. A Blob
object is a record holding the data along with content-length, content-type, and last-modified attributes.
The interface can be implemented in Java, but a few implementations are provided in Clojure: util.file provides a simple
wrapper around a local filesystem directory, and util.s3 provides a wrapper around Amazon's S3.
For example:
(use 'util.storage)
(use 'util.file)
(let [store (file-blobstore "dirname")
      blob1 (string-blob "This is a test" :content-type "text/plain")
      blob2 (file-blob "some-file.jar")
      file (java.io.File. "README.md")
      blob3 (blob (java.io.FileInputStream. file)
                  :content-type "text/markdown"
                  :content-length (.length file)
                  :last-modified (.lastModified file))]
  (put-blob store "one" blob1)
  (put-blob store "two" blob2)
  (put-blob store "three" blob3)
  (println (get-blob store "one"))
  (let [byte-array (:data (read-blob-fully (get-blob store "three")))]
    (println "byte array retrieved with length" (count byte-array)))
  (let [b (get-blob store "two")]
    (write-blob-file b "some-other-file.jar")) ; the streams are copied; the blob is not held in memory all at once
  (loop [chunk (scan-blobs store)]
    (doseq [info (:summaries chunk)]
      (println info))
    (when-let [more (:more chunk)]
      (println "-------")
      (recur (scan-blobs store :skip more)))))
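The interface also defines a delete operation. Assuming the Clojure wrapper follows the same naming pattern as put-blob and get-blob (delete-blob is an assumed name, not confirmed by this README), removing an entry might look like:

```clojure
(use 'util.storage)
(use 'util.file)

;; delete-blob is an assumed name, following the put-blob/get-blob pattern.
(let [store (file-blobstore "dirname")]
  (delete-blob store "one")
  ;; a subsequent get should no longer find the key
  (println (get-blob store "one")))
```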
Structured storage is aimed at smaller structured values in a key/value store. An IStructStorage interface is provided
with put, get, scan, and delete operations; the value passed around is a clojure.lang.IPersistentMap, with the
requirement that the object be serializable to JSON. The value size is limited to 64k to be compatible with a variety of
storage services.
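Because of the 64k cap, a caller may want to check the serialized size before a put. A minimal sketch, assuming clojure.data.json is on the classpath (it is a separate dependency, not part of this library):

```clojure
(require '[clojure.data.json :as json])

;; Returns true if the map's JSON encoding fits under the 64k limit.
(defn fits-64k? [m]
  (< (alength (.getBytes (json/write-str m) "UTF-8")) 65536))

(fits-64k? {:title "This is a test" :foo [1 2 3]}) ; => true
```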
The provided implementations include util.jdbc, which offers a generic JDBC implementation with both SQLite and H2 examples,
and util.dynamo, which provides identical functionality on Amazon's DynamoDB service.
For example:
(use 'util.storage)
(use 'util.jdbc)
(let [store (sqlite-structstore "filename")]
  (put-struct store "one" {:title "This is a test" :foo [1 2 3]})
  (put-struct store "two" {:title "Another one" :bar {:bletch 23}})
  (println "one:" (get-struct store "one"))
  (loop [chunk (scan-structs store)]
    (doseq [[key struct] (:structs chunk)]
      (println key "->" struct))
    (when (:more chunk)
      (recur (scan-structs store :skip (:more chunk))))))
The blob and struct examples above also work with S3 and DynamoDB, respectively. Both require that you define your Amazon credentials in
the environment and then access them with util.aws, as follows:
(use 'util.storage)
(use 'util.aws)
(use 'util.s3)
(let [store (s3-blobstore "my-s3-bucket-name")]
  (put-blob store "one" blob1)
  ...
The credentials are fetched with the aws-credentials function in util.aws.
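The README does not specify which environment variables util.aws reads; assuming it follows the standard AWS naming convention (an assumption, not confirmed here), the credentials might be set like this:

```shell
# Variable names are an assumption; util.aws's exact environment
# lookup is not documented in this README.
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
```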
There also is an interface to EC2 for provisioning and talking to clusters of machines in the cloud.
For example:
(use 'util.aws)
(use 'util.ec2)
(let [machines (create-cluster "myname" 3 :ami "ami-aecd60c7" :type "t1.micro" :keypair "ec2keypair" :user "ec2-user" :security "default")]
  (doseq [machine machines] ; doseq rather than map: map is lazy, so its side effects would never run
    (command machine "sudo yum -y install emacs"))
  (put-file (first machines) "my/local/file" "remote/file")
  (get-file (second machines) "remote/file" "local/file")
  ...)
A more involved example is in the works.
These are minimal (as in: very primitive) utilities, built on the util.ec2 package, to deploy D. J. Bernstein's daemontools,
Apache ZooKeeper, and Storm to the cloud.
For example, to deploy a 6-node Storm cluster (1 ZooKeeper, 1 Nimbus, and 4 workers) to EC2:
(use 'util.storm)
(deploy-storm :workers 4)
It takes less than 15 minutes for everything to get running, initialized, deployed, and launched under supervision. It automatically
modifies your local ~/.storm/storm.yaml file so that the local Storm client connects to the new cluster. It uses a number of defaults that
should probably be made overridable, but it is a starting point, anyway. This is not meant to replace a more comprehensive solution
like Pallet/jclouds, but I had trouble debugging problems with that system, so I wrote this quick hack for playing around. Don't
take it too seriously :-)
The same thing works for S4, now that v0.5 makes things easier:
(use 'util.s4)
(deploy-s4 "my-cluster" :nodes 4)
(status-s4 "my-cluster")
The difference is that for S4, the EC2 cluster is dedicated to one logical S4 "cluster"; different partitioning (and node allocation) requires deploying a separate cluster.
By default, only the "us-east" AWS endpoint is supported. To change it to any other endpoint, use the standard AWS API tools environment setting:
export EC2_URL="endpoint_here"
where "endpoint_here" is one of the following:
ec2.eu-west-1.amazonaws.com
ec2.sa-east-1.amazonaws.com
ec2.us-east-1.amazonaws.com
ec2.ap-northeast-1.amazonaws.com
ec2.us-west-2.amazonaws.com
ec2.us-west-1.amazonaws.com
ec2.ap-southeast-1.amazonaws.com
ec2.ap-southeast-2.amazonaws.com
If the setup is behind a proxy, edit ec2.clj to set the proper proxy host (shown as "proxy.address.com" below) and the corresponding proxy port:
(defn- ec2 [cred]
  (AmazonEC2Client. (BasicAWSCredentials. (:access cred) (:secret cred))
                    (doto (new com.amazonaws.ClientConfiguration)
                      (.withProxyHost "proxy.address.com")
                      (.withProxyPort proxy_port))))
To deploy an S4 app to a newly created cluster in AWS, use the following (the case of a Twitter counter):
(use 'util.s4)
(deploy-s4 "my-cluster" :nodes 4 :params "s4.adapter.output.stream" :value "RawStatus" :key "key_name")
(deploy-app-s4 "my-app" :cluster "my-cluster")
(status-s4 "my-app")
The code above uses the s4 utility package for cluster deployment; the S4 app deployment targets the "my-cluster" cluster by name. It is now possible to pass parameters to the cluster.
Original code: Lee Boynton
Updates/contributions: Sergey Boldyrev
Distributed under the Eclipse Public License, the same as Clojure.