The raw data from the babashka survey held in November 2020 is now available here.
We had about 100 respondents (just below the free tier of SurveyMonkey). Next time I'll probably switch to Google forms and add an optional contact field, since I would have liked to go deeper into some of the answers.
Here are the ten questions from the survey followed by the summarized outcome and my commentary.
Thank you all for taking the time. This provides useful input for the future development of babashka.
Outcome: 75% work, 76% hobby
My comment: Most people use it both at work and for their hobby programming. This is more than I could have hoped for when I started babashka.
Outcome:
- clojure.java.shell *******************************
- cheshire.core (json) ****************************
- babashka.curl ************************
- clojure.java.io *********************
- clojure.string ******************
- babashka.process **************
- clojure.edn *******
- clojure.tools.cli *****
- http (in general) ******
- clojure.data.xml ****
- clojure.data.csv ****
- clj-yaml.core ****
- dates / java.time ****
- nREPL server ****
- sql / jdbc ****
- org.httpkit.client ***
- i/o flags ***
- clojure.pprint **
- org.httpkit.server *
- babashka.wait *
- multi-file project *
- docopt *
- ProcessBuilder *
- java.util *
- transit *
- core.async *
- clojure.data *
- aero *
- clojure.zip *
My comment: Shelling out is a popular thing to do in babashka:
clojure.java.shell
is the most widely used namespaces. The babashka.process
namespace is already used quite a lot despite its recent appearance.
Other popular ways to use babashka:
- http (
babashka.curl
or other) - JSON (cheshire)
- file interaction (
clojure.java.io
) - string manipulation (
clojure.string
)
Nothing surprising there for a scripting tool. Closely following is tools.cli
,
and several other data formats (edn, csv, xml, yaml). The transit format isn't
that widely used from babashka.
Some people are using babashka for interacting with SQL databases. There are two ways to do this: compile babashka with extra feature flags enabled or use babashka-sql-pods.
According to the above outcome, babaska isn't that widely used for small web
applications yet. The core.async
library had only had one mention although
nobody mentioned it as redundant in question 5.
Outcome:
- sql (mostly postgres) ***************
- file watcher ****
- clj-kondo ***
- kafka **
- tabl **
- tzzh-aws **
- etaoin **
- brisk *
- lanterna *
- tzzh-mail *
- bootleg *
- parcera *
My comment:
The sql and filewatcher pods are the most popular, closely followed by the clj-kondo pod.
The pod concept isn't widely understood yet. In 2021 I might give a talk, if the opportunity arises, on babashka pod usage and how to develop them. Pods are an extra thing to install which may be a barrier for adoption for some. This ties into the next question.
Outcome:
- easy usage of libs and pods ***********
- java.nio bindings / fs lib ****
- clojure.spec (+ gen) ****
- database / jdbc ***
- raspberry pi support ***
- hiccup **
- compile script to native **
- spire as pod *
- docs and examples *
- packaging *
- ssh *
- nrepl *
- http server *
- easier creation of pods *
- sqlite *
- java-time lib *
- better REPL *
My comment:
Clearly users want an easier way to include libraries and pods. We are thinking about that in this issue.
The second most mentioned missing feature was a library around files. This is work in progress here.
Most people did not suggest anything should be removed, maybe also related to the most given answer to Q9. A few people expressed confusion about multiple ways of making http requests.
My comment: If have attempted to write about HTTP request in babashka
here and
here.
The summary is that both babashka.curl
and org.httpkit.client
have different
optimal use cases. In most small scripting scenarios babashka.curl
will
do. For making many small requests org.httpkit.client
is more optimal since it
won't create an OS process. For downloading big files, babashka.curl
is more
optimal, since there is no way to prevent org.httpkit.client
from holding the
response into memory all at once.
In the future both of these clients may be superseded by a client based on
java.net.http
(probably exposed as babashka.http-client
), but this client
isn't there yet and it may take a while before the Java 11 client space has
crystallized. Meanwhile you will have to do it with either of these
clients. Luckily both clients accept and return Ring-like maps, so upgrading to
the future client should not be hard, unless you are depending on something very
specific. I plan on keeping babashka.curl
in babashka no matter what, since
it's a very thin layer over curl
and doesn't complicate the compilation
process of babashka much. There will be a transition period of having both
org.httpkit.client
and the java.net.http
client after which
org.httpkit.client
will be phased out and babashka.http-client
will be the
recommended HTTP client in babashka.
Outcome:
- Yes: 7.8%
- No: 69%
- It depends: 23%
My comment:
The most given answer is: no, keep it this way. Most of the "it depends" answers said: I don't really care. Some suggested that printing should always be explicit and maybe should be controlled using a flag.
The reason babashka always prints the list value of a script, unless it is
nil
, is that in the very beginning, babashka was intended as a Clojure-in-bash
one-liner utility. We only had the -e
option and not the -f
option for
executing scripts from disk. When the -f
option got added, it was handled
exactly the same as -e
, with the difference that the expression was read from
a file.
Since the majority of people expressed that this should not change, we will keep
it this way. This will also avoid breakage for scripts that depend on
this. Stylistically it might be better to use explicit printing in scripts. If a
script returns nil
, nothing will get printed for you, which aligns nicely with
how println
and prn
behave.
Outcome:
The majority of people use babashka as a replacement for bash
, python
or
jq
. The fast startup is the main reason they favor it over the JVM when
writing small and short-running scripts that are cross-platform. In this
scenario babashka is used for CLI utilities, ETL (data processing) scripts,
automation / CI / devops and scripts ran from editors like emacs and vim. Some
people use it in places where they want to avoid or can't have a JVM
installation or in memory constrained environments. Some people answered:
whenever I can.
My comment:
Babashka's original goal was to be a Clojure replacement for shell scripting and based on the answers, it seems to have pulled that off.
Outcome:
- Linux: 80%
- macOS: 58%
- Windows: 16%
- Other: 2% (raspberry if possible, AWS Lambda)
My comment:
It's interesting to compare the outcome of this question with the answers to "what is your primary development OS" in the Clojure 2020 survey here.
Note that I didn't ask about the development OS but where scripts are run and multiple answers were allowed.
Outcome:
- Yes: 13%
- No: 87%
My comment:
The people who care about binary size report that it should not grow beyond the range of 10mb and 512mb. But most people don't care about it.
I do personally care about it from the perspective of not cluttering babashka unnecessarily during its lifetime. Related, adding more stuff challenges GraalVM compile time and memory usage during compilation. Hitting the limit on free CI might mean I will have to charge for building binaries in the future or come up with some business model to finance the builds.
Outcome:
Lots of encouraging words. Thank you all!