-
Notifications
You must be signed in to change notification settings - Fork 3
Speed Comparisons
NOTE: As Loopy develops, these examples and results might become out of date. For the most up-to-date information, refer to this package’s documentation, which is installed with the package as the Info file loopy
.
Additionally, this information can/should probably be expanded to compare to several other approaches.
This file contains benchmarks comparing cl-loop
, loopy
, some of the built-in
C functions like mapcar
and mapcan
, some of CL Lib’s functions like
cl-mapcar
and cl-mapcan
, and some of Dash’s functions like -map
and
-mapcat
.
In general, for simple cases, they can be sorted from fastest to slowest as
- C functions
- Dash’s functions (usually)
-
loopy
andcl-loop
.
The C functions are fastest because they are in C, and Dash’s functions are
faster because they try to be pure and side-effect free (which allows for better
optimizations). loopy
tries to follow the same logic as cl-loop
, so they are
usually about the same speed.
loopy
and cl-loop
work best for more procedural (as opposed to functional)
approaches, but come with some overhead for being good at that case. If you’re
working with simple functions, it might be best to use the built-in functions or
Dash. If working with complex functions or special forms like dolist
or
dotimes
, your code might be easier to write with loopy
or cl-loop
.
loopy
tries to follow the same logic as cl-loop
, but in being more
flexible in some cases, it cannot always implement concepts as efficiently as
cl-loop
. However, for general cases, they should be more or less the same.
These comparisons are made using macros listed in user Alphapapa’s useful Emacs Package Dev Handbook. For completeness, they are copied below.
(require 'cl-lib)
;;;###autoload
(cl-defmacro bench (&optional (times 100000) &rest body)
"Call `benchmark-run-compiled' on BODY with TIMES iterations, returning list suitable for Org source block evaluation.
Garbage is collected before calling `benchmark-run-compiled' to
avoid counting existing garbage which needs collection."
(declare (indent defun))
`(progn
(garbage-collect)
(list '("Total runtime" "# of GCs" "Total GC runtime")
'hline
(benchmark-run-compiled ,times
(progn
,@body)))))
;; TODO: Consider not using `-on' here to avoid the extra dependency.
(require 'dash-functional)
;;;###autoload
(cl-defmacro bench-multi (&key (times 1) forms ensure-equal raw)
"Return Org table as a list with benchmark results for FORMS.
Runs FORMS with `benchmark-run-compiled' for TIMES iterations.
When ENSURE-EQUAL is non-nil, the results of FORMS are compared,
and an error is raised if they aren't `equal'. If the results are
sequences, the difference between them is shown with
`seq-difference'.
When RAW is non-nil, the raw results from
`benchmark-run-compiled' are returned instead of an Org table
list.
If the first element of a form is a string, it's used as the
form's description in the bench-multi-results; otherwise, forms
are numbered from 0.
Before each form is run, `garbage-collect' is called."
;; MAYBE: Since `bench-multi-lexical' byte-compiles the file, I'm not sure if
;; `benchmark-run-compiled' is necessary over `benchmark-run', or if it matters.
(declare (indent defun))
(let*((keys (gensym "keys"))
(result-times (gensym "result-times"))
(header '(("Form" "x faster than next" "Total runtime" "# of GCs" "Total GC runtime")
hline))
;; Copy forms so that a subsequent call of the macro will get the original forms.
(forms (cl-copy-list forms))
(descriptions (cl-loop for form in forms
for i from 0
collect (if (stringp (car form))
(prog1 (car form)
(setf (nth i forms) (cadr (nth i forms))))
i))))
`(unwind-protect
(progn
(defvar bench-multi-results nil)
(let* ((bench-multi-results (make-hash-table))
(,result-times (sort (list ,@(cl-loop for form in forms
for i from 0
for description = (nth i descriptions)
collect `(progn
(garbage-collect)
(cons ,description
(benchmark-run-compiled ,times
,(if ensure-equal
`(puthash ,description ,form bench-multi-results)
form))))))
(lambda (a b)
(< (cl-second a) (cl-second b))))))
,(when ensure-equal
`(cl-loop with ,keys = (hash-table-keys bench-multi-results)
for i from 0 to (- (length ,keys) 2)
unless (equal (gethash (nth i ,keys) bench-multi-results)
(gethash (nth (1+ i) ,keys) bench-multi-results))
do (if (sequencep (gethash (car (hash-table-keys bench-multi-results)) bench-multi-results))
(let* ((k1) (k2)
;; If the difference in one order is nil, try in other order.
(difference (or (setq k1 (nth i ,keys)
k2 (nth (1+ i) ,keys)
difference (seq-difference (gethash k1 bench-multi-results)
(gethash k2 bench-multi-results)))
(setq k1 (nth (1+ i) ,keys)
k2 (nth i ,keys)
difference (seq-difference (gethash k1 bench-multi-results)
(gethash k2 bench-multi-results))))))
(user-error "Forms' bench-multi-results not equal: difference (%s - %s): %S"
k1 k2 difference))
;; Not a sequence
(user-error "Forms' bench-multi-results not equal: %s:%S %s:%S"
(nth i ,keys) (nth (1+ i) ,keys)
(gethash (nth i ,keys) bench-multi-results)
(gethash (nth (1+ i) ,keys) bench-multi-results)))))
;; Add factors to times and return table
(if ,raw
,result-times
(append ',header
(bench-multi-process-results ,result-times)))))
(unintern 'bench-multi-results nil))))
(defun bench-multi-process-results (results)
"Return sorted RESULTS with factors added."
(setq results (sort results (-on #'< #'cl-second)))
(cl-loop with length = (length results)
for i from 0 to (1- length)
for description = (car (nth i results))
for factor = (if (< i (1- length))
(format "%.2f" (/ (cl-second (nth (1+ i) results))
(cl-second (nth i results))))
"slowest")
collect (append (list description factor)
(list (format "%.6f" (cl-second (nth i results)))
(cl-third (nth i results))
(if (> (cl-fourth (nth i results)) 0)
(format "%.6f" (cl-fourth (nth i results)))
0)))))
The following clauses don’t have exact equivalents in Loopy, but most can be
mimicked by using Emacs’s built-in functions and the list
loop command. See
the section For Clauses under Translating … for the relevant function.
for VAR from EXPR1 to EXPR2 by EXPR3
for VAR being the symbols
for VAR being the hash-keys
for VAR being the hash-values
for VAR being the key-codes
for VAR being the key-bindings
for VAR being the key-bindings
for VAR being the key-seqs
for VAR being the overlays
for VAR being the intervals
for VAR being the frames
for VAR being the windows
for VAR being the buffers
-
for VAR in LIST by FUNCTION
:(let ((l (number-sequence 1 100))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i in l)) ("loopy" (loopy ((list i l)))) ;; These shouldn’t be different. ("cl-loop with function" (cl-loop for i in l by #'cdr)) ("loopy with function" (loopy ((list i l #'cdr)))))))
Form x faster than next Total runtime # of GCs Total GC runtime cl-loop 1.04 0.027221 0 0 loopy 1.00 0.028386 0 0 cl-loop with function 1.00 0.028422 0 0 loopy with function slowest 0.028553 0 0 -
for VAR on LIST by FUNCTION
:(let ((l (number-sequence 1 100))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i on l)) ("loopy" (loopy ((cons i l)))) ;; These shouldn’t be different. ("cl-loop with function" (cl-loop for i on l by #'cdr)) ("loopy with function" (loopy ((cons i l #'cdr)))))))
Form x faster than next Total runtime # of GCs Total GC runtime loopy 1.02 0.017559 0 0 loopy with function 1.00 0.017893 0 0 cl-loop with function 1.01 0.017907 0 0 cl-loop slowest 0.018028 0 0 -
for VAR in-ref LIST by FUNCTION
:(let ((l (number-sequence 1 100))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i in-ref l)) ("loopy" (loopy ((list-ref i l)))) ;; These shouldn’t be different. ("cl-loop with function" (cl-loop for i in-ref l by #'cdr)) ("loopy with function" (loopy ((list-ref i l #'cdr)))))))
Form x faster than next Total runtime # of GCs Total GC runtime cl-loop 1.00 0.017619 0 0 loopy 1.04 0.017678 0 0 loopy with function 1.01 0.018400 0 0 cl-loop with function slowest 0.018612 0 0 -
for VAR across ARRAY
:(let ((a (make-vector 1000 1))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i across a)) ("loopy" (loopy ((array i a)))))))
Form x faster than next Total runtime # of GCs Total GC runtime loopy 1.08 0.500913 0 0 cl-loop slowest 0.541114 0 0 -
for VAR across ARRAY
:(let ((a (make-vector 1000 1))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i across-ref a)) ("loopy" (loopy ((array-ref i a)))))))
Form x faster than next Total runtime # of GCs Total GC runtime loopy 1.16 0.342828 0 0 cl-loop slowest 0.397609 0 0 -
for VAR being the elements of SEQUENCE
:(let ((a (make-vector 1000 1)) (l (make-list 1000 1))) (bench-multi :times 10000 :forms (("cl-loop array" (cl-loop for i being the elements of a)) ("loopy array" (loopy ((seq i a)))) ("cl-loop list" (cl-loop for i being the elements of l)) ("loopy list" (loopy ((seq i l)))))))
Form x faster than next Total runtime # of GCs Total GC runtime cl-loop list 1.01 0.667158 0 0 loopy list 1.25 0.674926 0 0 loopy array 1.01 0.845188 0 0 cl-loop array slowest 0.852027 0 0 -
for VAR being the elements of-ref SEQUENCE
:(let ((a (make-vector 1000 1)) (l (make-list 1000 1))) (bench-multi :times 10000 :forms (("cl-loop array" (cl-loop for i being the elements of-ref a)) ("loopy array" (loopy ((seq-ref i a)))) ("cl-loop list" (cl-loop for i being the elements of-ref l)) ("loopy list" (loopy ((seq-ref i l)))))))
Form x faster than next Total runtime # of GCs Total GC runtime loopy array 1.00 0.291586 0 0 cl-loop array 1.04 0.292248 0 0 cl-loop list 1.02 0.302581 0 0 loopy list slowest 0.309345 0 0 -
for VAR = EXPR1 then EXPR2
:(bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i = 0 then (1+ i) repeat 100)) ("loopy" (loopy ((expr i 0 (1+ i)) (repeat 100))))))
Form x faster than next Total runtime # of GCs Total GC runtime cl-loop 1.02 0.005220 0 0 loopy slowest 0.005346 0 0
The clauses always
, never
, thereis
, and iter-by
are currently not
implemented in Loopy, but their usages can be matched with existing
(though currently less convenient) loop commands.
-
repeat
:(bench-multi :times 1000 :forms (("cl-loop" (cl-loop repeat 100)) ; Uses `>=' and `1-'. ("loopy" (loopy ((repeat 100)))))) ; Uses `<' and `1+'.
Form x faster than next Total runtime # of GCs Total GC runtime cl-loop 1.03 0.002867 0 0 loopy slowest 0.002954 0 0 -
while
:while
conditions inloopy
are checked separately from the loops normal continuance conditions in anif
form. Incl-loop
, they in the same code section that sets variables in anand
form. This is probably the cause of the speed difference.(bench-multi :times 10000 :forms (("cl-loop v1" (cl-loop while t repeat 100)) ("loopy v1" (loopy ((while t) (repeat 100)))) ;; Try opposite order. ("cl-loop v2" (cl-loop repeat 100 while t)) ("loopy v2" (loopy ((repeat 100) (while t))))))
Form x faster than next Total runtime # of GCs Total GC runtime cl-loop v2 1.05 0.028047 0 0 cl-loop v1 1.11 0.029580 0 0 loopy v2 1.03 0.032949 0 0 loopy v1 slowest 0.034065 0 0 -
until
:(bench-multi :times 10000 :forms (("cl-loop v1" (cl-loop until nil repeat 100)) ("loopy v1" (loopy ((until nil) (repeat 100)))) ;; Try opposite order. ("cl-loop v2" (cl-loop repeat 100 until nil)) ("loopy v2" (loopy ((repeat 100) (until nil))))))
Form x faster than next Total runtime # of GCs Total GC runtime cl-loop v2 1.02 0.027976 0 0 cl-loop v1 1.14 0.028576 0 0 loopy v2 1.01 0.032519 0 0 loopy v1 slowest 0.032985 0 0
cl-loop
uses special considerations for append
, collect
, and nconc
using combinations append
, nconc
, push
, nreverse
, and reverrse
depending on how variables are being declared and used. loopy
is currently
much simpler, such as using push
and nreverse
for collect
when no
explicitly variable is given (as in (collect some-val)
) and using append
otherwise.
You should note that naming the variables into which you append
,
collect
, or nconc
can be much slower, since the macros can no longer use
tricks with reverse
and nreverse
. If they did, you might try to access
the accumulation variable before it was reversed.
-
collect
:(let ((l (number-sequence 1 1000))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i in l collect (* (1+ i) 2))) ("loopy" (loopy ((list i l) (collect (* (1+ i) 2))))) ("mapcar" (mapcar (lambda (i) (* (1+ i) 2)) l)) ("cl-mapcar" (cl-mapcar (lambda (i) (* (1+ i) 2)) l)) ("dash" (-map (lambda (i) (* (1+ i) 2)) l)) ("anaphoric dash" (--map (* (1+ it) 2) l)) ("dolist" (let ((result)) (dolist (i l (nreverse result)) (push (* (1+ i) 2) result)))))))
Form x faster than next Total runtime # of GCs Total GC runtime dolist 1.27 0.669873 0 0 cl-loop 1.05 0.849984 0 0 loopy 1.09 0.896564 0 0 mapcar 1.02 0.974399 0 0 cl-mapcar 1.01 0.989138 0 0 anaphoric dash 1.43 0.995825 0 0 dash slowest 1.425671 0 0 -
append
andnconc
:(let ((l (number-sequence 1 1000))) (bench-multi :times 10000 :forms (("cl-loop append" (cl-loop for i in l append (list (* (1+ i) 2) (* (1+ i) 3)))) ("cl-loop nconc" (cl-loop for i in l nconc (list (* (1+ i) 2) (* (1+ i) 3)))) ("loopy append" (loopy ((list i l) (append (list (* (1+ i) 2) (* (1+ i) 3)))))) ("loopy nconc" (loopy ((list i l) (nconc (list (* (1+ i) 2) (* (1+ i) 3)))))) ("mapcan" (mapcan (lambda (i) (list (* (1+ i) 2) (* (1+ i) 3))) l)) ; C function. ("cl-mapcan" (cl-mapcan (lambda (i) (list (* (1+ i) 2) (* (1+ i) 3))) l)) ("dash" (-mapcat (lambda (i) (list (* (1+ i) 2) (* (1+ i) 3))) l)) ("anaphoric dash" (--mapcat (list (list (* (1+ it) 2) (* (1+ it) 3))) l)))))
Form x faster than next Total runtime # of GCs Total GC runtime mapcan 1.01 1.554805 0 0 cl-mapcan 1.10 1.576170 0 0 cl-loop nconc 1.01 1.733220 0 0 loopy nconc 1.12 1.758132 0 0 anaphoric dash 1.02 1.966069 0 0 cl-loop append 1.01 2.014755 0 0 loopy append 2.58 2.034501 0 0 dash slowest 5.250537 1 0.432886
-
concat
:This difference might be because
cl-loop
usescl-callf
, whileloopy
currently usesconcat
andsetq
in all cases.seq-mapcat
andmapconcat
might be much faster because they only callconcat
once after mapping across all values. On the other hand,loopy
andcl-loop
currently callconcat
for each iteration. On the other hand,cl-loop
callsconcat
for each iteration, as doesloopy
if the accumulation variable is explicitly named.(let ((l (make-list 100 (make-string 100 ?a)))) (bench-multi :times 1000 :forms (("cl-loop" (cl-loop for i in l concat (capitalize i))) ;; Here, `loopy' calls `concat' once: ("loopy implicit" (loopy (list i l) (concat (capitalize i)))) ;; Here, `loopy', calls `concat' each iteration: ("loopy explicit" (loopy (list i l) (concat my-str (capitalize i)))) ("mapconcat" (mapconcat #'capitalize l "")) ; C function. ("concat" (apply #'concat (mapcar #'capitalize l))) ;; `cl-concatenate' is alias of `seq-concatenate'. ("cl-concatenate" (apply #'cl-concatenate 'string (mapcar #'capitalize l))) ("seq-macpcat" (seq-mapcat #'capitalize l 'string)))))
Form x faster than next Total runtime # of GCs Total GC runtime concat 1.01 0.235516 0 0 cl-concatenate 1.01 0.237146 0 0 seq-macpcat 1.00 0.239688 0 0 mapconcat 1.01 0.240340 0 0 loopy implicit 3.28 0.243335 0 0 loopy explicit 1.01 0.797231 0 0 cl-loop slowest 0.803007 0 0 -
vconcat
:As with
concat
above,cl-loop
callsconcat
for each iteration, while other functions create a list of sequences and then callconcat
once.(let* ((l (make-list 100 (number-sequence 1 1000)))) (bench-multi :times 1000 :forms (("cl-loop" (cl-loop for i in l vconcat (cdr i))) ("loopy implicit" (loopy (list i l) (vconcat (cdr i)))) ("loopy explicit" (loopy (list i l) (vconcat my-vect (cdr i)))) ("vconcat" (apply #'vconcat (mapcar #'cdr l))) ("cl-concatenate" (apply #'cl-concatenate 'vector (mapcar #'cdr l))) ;; This concatenates once after the `cdr' of each list is gotten. ("seq-macpcat" (seq-mapcat #'cdr l 'vector)))))
Form x faster than next Total runtime # of GCs Total GC runtime vconcat 1.00 0.425114 0 0 seq-macpcat 1.00 0.425321 0 0 cl-concatenate 1.00 0.426510 0 0 loopy implicit 61.55 0.427345 0 0 loopy explicit 1.00 26.303019 37 4.140312 cl-loop slowest 26.337261 37 4.138096 -
count
:(let* ((l (mapcar #'cl-evenp (number-sequence 1 1000)))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i in l count i)) ("loopy" (loopy ((list i l) (count i)))) ("cl-count" (cl-count t l)) ("cl-count-if" (cl-count-if #'identity l)) ("seq-count" (seq-count #'identity l)) ("-count" (-count #'identity l)))))
Form x faster than next Total runtime # of GCs Total GC runtime -count 1.55 0.238401 0 0 cl-loop 1.03 0.369262 0 0 loopy 1.03 0.378889 0 0 cl-count 1.65 0.389203 0 0 cl-count-if 1.19 0.641331 0 0 seq-count slowest 0.762337 0 0 -
sum
:(let* ((l (number-sequence 1 1000))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i in l sum (1+ i))) ("loopy" (loopy ((list i l) (sum (1+ i))))) ("mapcar and apply" (apply #'+ (mapcar #'1+ l))) ("cl-reduce and apply" (cl-reduce #'+ (mapcar #'1+ l))) ("-reduce and apply" (-reduce #'+ (mapcar #'1+ l))))))
Form x faster than next Total runtime # of GCs Total GC runtime mapcar and apply 1.31 0.432464 0 0 loopy 1.05 0.565446 0 0 cl-loop 1.18 0.591978 0 0 -reduce and apply 1.65 0.699388 0 0 cl-reduce and apply slowest 1.153148 0 0 -
maximize
:(let* ((l (number-sequence 1 1000))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i in l maximize (1- i))) ("loopy" (loopy ((list i l) (max (1- i))))) ("mapcar and apply" (apply #'max (mapcar #'1- l))) ("mapcar and cl-reduce" (cl-reduce #'max (mapcar #'1- l))) ("mapcar and -reduce" (-reduce #'max (mapcar #'1- l))))))
Form x faster than next Total runtime # of GCs Total GC runtime mapcar and apply 1.17 0.505320 0 0 loopy 1.29 0.591463 0 0 mapcar and -reduce 1.35 0.761307 0 0 cl-loop 1.20 1.028224 0 0 mapcar and cl-reduce slowest 1.233562 0 0 -
minimize
:(let* ((l (number-sequence 1 1000))) (bench-multi :times 10000 :forms (("cl-loop" (cl-loop for i in l minimize (1- i))) ("loopy" (loopy ((list i l) (min (1- i))))) ("mapcar and apply" (apply #'min (mapcar #'1- l))) ("mapcar and cl-reduce" (cl-reduce #'min (mapcar #'1- l))) ("mapcar and -reduce" (-reduce #'min (mapcar #'1- l))))))
Form x faster than next Total runtime # of GCs Total GC runtime mapcar and apply 1.21 0.491284 0 0 loopy 1.24 0.595057 0 0 mapcar and -reduce 1.43 0.737828 0 0 cl-loop 1.13 1.051751 0 0 mapcar and cl-reduce slowest 1.193562 0 0
(let ((alist (cl-mapcar #'cons
(number-sequence 0 1000)
(number-sequence 2000 3000))))
(bench-multi
:times 10000
:forms (("cl-loop" (cl-loop for (i . j) in alist
collect (+ i j)))
("loopy" (loopy ((list (i . j) alist)
(collect (+ i j)))))
("pcase-lambda" (mapcar (pcase-lambda (`(,i . ,j)) (+ i j)) alist))
("-lambda" (mapcar (-lambda ((i . j)) (+ i j)) alist)))))
Form | x faster than next | Total runtime | # of GCs | Total GC runtime |
---|---|---|---|---|
cl-loop | 1.12 | 0.790794 | 0 | 0 |
loopy | 2.08 | 0.882744 | 0 | 0 |
-lambda | 1.13 | 1.835495 | 0 | 0 |
pcase-lambda | slowest | 2.081928 | 0 | 0 |