Boot lein uberjar equivalent - performance challenges #94

Closed
rundis opened this issue Jan 18, 2015 · 8 comments

@rundis

rundis commented Jan 18, 2015

I created a boot port of my lein build.
It worked (after I built boot from source and ran lein install).

In summary:
On my Mac (Mavericks) with jdk1.8.0_25:

  • With leiningen it takes approx 12-15 secs to create an uberjar for my project
  • With boot it takes approx 1:20-1:40 minutes to create an equivalent uberjar

build.boot

(set-env!
  :resource-paths #{"src"}
  :repositories '[["central" "https://repo1.maven.org/maven2/"]
                  ["clojars" "http://clojars.org/repo"]
                  ["Animalia nexus" "http://62.89.42.8:8082/nexus/content/groups/public"]]
  :dependencies '[[org.clojure/clojure "1.6.0"]
                  [compojure "1.2.1"]
                  [liberator "0.12.2"]
                  [ring/ring-jetty-adapter "1.3.1"]
                  [ring/ring-json "0.3.1"]
                  [bouncer "0.3.1"]
                  [org.clojure/java.jdbc "0.3.6"]
                  [com.oracle/ojdbc6 "11.2.0.4"]
                  [drift "1.5.2"]
                  [hikari-cp "0.11.1"]
                  [org.slf4j/slf4j-simple "1.7.7"]
                  [jarohen/nomad "0.7.0"]
                  [buddy "0.2.3"]])

(task-options!
 pom {:project 'animalia-autentisering
      :version "0.1.0"}
 aot {:namespace '#{animalia-autentisering.core}}
 jar {:main 'animalia_autentisering.core
      :manifest {"Description" "Autentiserings tjenester for animalia tjenester/applikasjoner"
                 "Url" "https://github.com/animalia/animalia-autentisering"}})


(deftask build
  "Build my project."
  []
  (comp 
   (aot)
   (pom)
   (uber)
   (jar)))

Running a leiningen build yields:

date +"%Y-%m-%d %H:%M:%S" &&  lein ring uberjar |  gawk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }' && date +"%Y-%m-%d %H:%M:%S"
2015-01-18 11:57:44
2015-01-18 11:57:54 Warning: specified :main without including it in :aot.
2015-01-18 11:57:54 Implicit AOT of :main will be removed in Leiningen 3.0.0.
2015-01-18 11:57:54 If you only need AOT for your uberjar, consider adding :aot :all into your
2015-01-18 11:57:54 :uberjar profile instead.
2015-01-18 11:57:56 Created /Users/mrundberget/projects/animalia/helsegris/animalia-autentisering/target/animalia-autentisering-0.1.0.jar
2015-01-18 11:58:00 Created /Users/mrundberget/projects/animalia/helsegris/animalia-autentisering/target/animalia-autentisering-0.1.0-standalone.jar
2015-01-18 11:58:00

Running with boot yields:

date +"%Y-%m-%d %H:%M:%S" &&  boot -v build 2>&1 >/dev/null |  gawk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }' && date +"%Y-%m-%d %H:%M:%S"
2015-01-18 11:54:41
2015-01-18 11:54:44 registering src [:create :modify]
2015-01-18 11:54:44 registering src/animalia_autentisering [:create :modify]
2015-01-18 11:54:44 registering src/config [:create :modify]
2015-01-18 11:54:44 registering src/db [:create :modify]
2015-01-18 11:54:44 registering src/migrations [:create :modify]
2015-01-18 11:54:44 sending change event
2015-01-18 11:54:48 Compiling animalia-autentisering.core...
2015-01-18 11:55:08 Writing pom.xml and pom.properties...
2015-01-18 11:55:13 Adding uberjar entries...
2015-01-18 11:55:36 Writing animalia-autentisering-0.1.0.jar...
2015-01-18 11:56:09

A couple of other thoughts:

  • Logging to stderr seems a bit unconventional?
  • When you get a compilation error, it's virtually impossible to figure out from the exception what went wrong (due to the use of temporary files?)
clojure.lang.ExceptionInfo: java.lang.IllegalArgumentException: Don't know how to create ISeq from: clojure.lang.Keyword
    data: {:file
           "../var/folders/c7/_w_4f1617dvbdkcx17ppnpfw0000gn/T/boot.user6212888234702715304.clj",
           :line 11}

Anyway, Boot looks really exciting! Keep up the good work :)

@Deraen
Contributor

Deraen commented Jan 18, 2015

I wonder if leiningen is using previously compiled files if they exist. Could you test leiningen again after running lein clean?

@rundis
Author

rundis commented Jan 18, 2015

I timed both several times, with rm -rf target as a precondition. I should have mentioned that.

@alandipert alandipert added this to the 2.0.0 milestone Jan 18, 2015
@alandipert
Contributor

Hi, thank you for the great analysis, and for the kind words!

For debugging your build.boot, check out boot -b. This emits the code that will run in boot.user as Clojure first sees it, and is identical to the file you see referenced in that ExceptionInfo message. The reason your build.boot isn't run directly is because we prepend ~/.profile.boot if it exists so that one can easily share e.g. REPL configuration between projects.

Re: your build.boot, this is a minor enhancement but you can actually add repositories by putting a function as the value for any key including :repositories. For instance:

:repositories #(conj % '["Animalia nexus" "http://62.89.42.8:8082/nexus/content/groups/public"])

OK now for the less-than-good news. Boot represents filesystems using an immutable data structure we call the Fileset. Fileset values are what handlers are passed, and also what they pass to the next handler. At the end of the build we emit the fileset value we got from the last handler into the target directory. Filesets are maybe the most important abstraction in boot, because they prevent tasks from having to coordinate around places on disk, which is effectively a shared/global namespace. Instead, because filesets are values, tasks can compose using normal functional programming means, and tasks don't need to know or care about the shape of the "real" filesystem at all.
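
To make the idea concrete, here is a minimal sketch in plain Clojure. It is not Boot's actual Fileset API: the fileset is modelled as an ordinary immutable map from path to content, and add-file, banner-task and count-task are hypothetical names invented for illustration.

```clojure
;; Illustrative model only, NOT Boot's real Fileset API.
;; A "fileset" here is just an immutable map of path -> content.

(defn add-file
  "Return a new fileset that also contains `path`; the input is left untouched."
  [fileset path content]
  (assoc fileset path content))

;; A task is middleware: it takes the next handler and returns a handler.
;; A handler receives a fileset value and passes a (possibly new) value along.
(defn banner-task [text]
  (fn [next-handler]
    (fn [fileset]
      (next-handler (add-file fileset "BANNER.txt" text)))))

(defn count-task []
  (fn [next-handler]
    (fn [fileset]
      ;; Records how many files this task saw on its way in.
      (next-handler
       (add-file fileset "COUNT.txt" (str (count fileset)))))))

;; Tasks compose with plain `comp`; `identity` stands in for the final step
;; that would write the last fileset out to the target directory.
(def build (comp (banner-task "hello") (count-task)))

(def input {"src/core.clj" "(ns core)"})
(def output ((build identity) input))
```

Neither task touches a shared directory: each one only sees the value handed to it and hands a new value on, which is why the order in `comp` is the only coordination needed.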

Right, getting to the bad news. So nothing we've seen so far taxes the fileset idea quite like the uberjar. The fileset was originally conceived to work with sources -- code being actively edited by the user of boot in a development session -- not exploded dependencies that don't change between build invocations. Unfortunately, even the smallest uberjar projects can have tens of thousands of files in the fileset depending on their dependencies.

There are definitely avenues for improving Fileset performance left to explore, and after exploring them we might be nearly as fast as lein. It's possible (but would suck) that uberjar is a special thing, and we need to special-case it somehow. We'll be looking into it, but help improving uberjar before 2.0.0 would be gratefully accepted.

In the meantime there are some workarounds I do personally and can imagine. At work we bypass uberjar completely by making boot the entrypoint for our webapps. We deploy our apps to AWS ElasticBeanstalk using a Dockerfile like this one and this boot task. This pawns the work of downloading the dependencies off on the app container, and makes for faster deployments because we only have to upload the repo source, not all dependencies. Tasks configure themselves in the container based on environment variables. That Dockerfile is actually used to launch 2 different SQS workers and a daemon, each with a different startup function. Which function to run is determined by the config task, which looks at the COMPONENT environment variable. We set the variable to something different for each target environment.
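
As a rough illustration of that dispatch (a sketch only, not our actual config task), the lookup could be as simple as the following. The component names and start functions are made up for the example; only the COMPONENT variable comes from the description above.

```clojure
;; Hypothetical sketch: component names and start functions are invented.
(defn start-worker-a [] :worker-a-started)
(defn start-worker-b [] :worker-b-started)
(defn start-daemon   [] :daemon-started)

(def components
  {"worker-a" start-worker-a
   "worker-b" start-worker-b
   "daemon"   start-daemon})

(defn startup-fn
  "Look up the startup function for `component`, failing loudly on unknown names."
  [component]
  (or (get components component)
      (throw (ex-info "Unknown COMPONENT" {:component component}))))

(defn start!
  "Read COMPONENT from the environment and run the matching startup function."
  []
  ((startup-fn (System/getenv "COMPONENT"))))
```

Keeping the lookup separate from the environment read means the mapping can be exercised at the REPL without setting any variables.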

Another workaround I can imagine is a task that emits a project.clj and runs lein uberjar for you.
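
A minimal sketch of what such a task might do, written as plain Clojure functions rather than a real Boot task. The function names and the shape of the input map are assumptions for illustration; clojure.java.shell/sh is the standard library call, and lein must be on the PATH for the second function to work.

```clojure
(require '[clojure.java.shell :refer [sh]])

(defn project-clj
  "Render a minimal project.clj form as a string from a Boot-style map.
   The keys used here are a hypothetical convention, not Boot's own."
  [{:keys [project version dependencies repositories source-paths main]}]
  (pr-str
   (list 'defproject project version
         :dependencies (vec dependencies)
         :repositories (vec repositories)
         :source-paths (vec source-paths)
         :main main
         :aot [main])))

(defn lein-uberjar!
  "Write project.clj into `dir` and run `lein uberjar` there.
   Returns the map produced by `sh` (:exit, :out, :err)."
  [dir env]
  (spit (str dir "/project.clj") (project-clj env))
  (sh "lein" "uberjar" :dir dir))

(def rendered
  (project-clj {:project      'demo
                :version      "0.1.0"
                :dependencies '[[org.clojure/clojure "1.6.0"]]
                :repositories []
                :source-paths #{"src"}
                :main         'demo.core}))
```

A project that builds with lein ring uberjar, as in the report above, would need the ring plugin configuration emitted as well; that part is left out of the sketch.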

@martinklepsch
Member

Are you passing the same JVM options to boot that you pass to leiningen? (It might still be slow.)
https://github.com/boot-clj/boot/wiki/JVM-Options

@rundis
Author

rundis commented Jan 18, 2015

@martinklepsch I haven't set any specific JVM options for lein. I tried tinkering with BOOT_JVM_OPTIONS as per your wiki, but with minimal impact (perhaps slightly faster startup, but that's just a drop in the ocean compared to the elapsed time for the compile, uber and jar steps).

@alandipert Thx for your elaborate answer. I'd love to contribute, but I'm just a few months into working with clojure in anger. It'll take some time before I'll be able to contribute anything really meaningful I suppose.

I've previously worked on porting large builds from maven to gradle. I see some parallels between maven->gradle and lein->boot. I see a lot of promise in boot so far. It's a bit like when I first discovered gradle after tearing my hair out trying to tweak maven to do something beyond what maven thinks a default build should do. I think there are some good ideas/inspirational areas to be cherry-picked from gradle (like incremental tasks, parallel task execution, the gradle wrapper etc).

I'm digressing. I'm not in a hurry, so I'll keep running lein for a while longer but will try to keep a parallel boot build functional. For boot to get higher adoption, in particular for clojure ring projects, I guess some dramatic performance enhancements need to be implemented though. If I am able to contribute (either with code or testing if that helps) I'm happy to do so.

@micha
Contributor

micha commented Nov 9, 2015

@rundis it would be great to have a reference project to measure progress with this. I've been working on uberjar performance a lot lately, things are getting much better.

@micha
Contributor

micha commented Nov 14, 2015

I think the worst of the uber task performance issues have been resolved.

See: https://github.com/micha/boot-uberjar-perf

@micha micha closed this as completed Nov 14, 2015
@xiongtx

xiongtx commented Jun 19, 2017

Is the performance issue really resolved? What commit resolved it?
