jamesdbrock/replace-benchmark

replace-benchmark

Benchmarks for replace-megaparsec and replace-attoparsec.

Usage

To run the benchmarks:

nix run github:jamesdbrock/replace-benchmark

Method

These benchmarks are intended to measure the wall-clock speed of everything except the actual pattern-matching. Speed of the pattern-matching is the responsibility of the megaparsec and attoparsec libraries.

The benchmark task is to find every occurrence of the one-character pattern x in a text stream and replace it with the result of a function which returns the constant string oo. This is equivalent to the regex substitution s/x/oo/g.
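As a minimal sketch of this task, here is roughly what the Python `re.sub` variant with a replacement function could look like. The names `replacement` and `replace_all` are made up for illustration and are not taken from the benchmark sources.

```python
import re


def replacement(match: re.Match) -> str:
    """Replacement function: ignores the match and returns the constant "oo"."""
    return "oo"


def replace_all(text: str) -> str:
    """Replace every "x" in `text` with the result of calling `replacement`."""
    # re.sub calls `replacement` once per non-overlapping match of the pattern.
    return re.sub("x", replacement, text)


print(replace_all("x x x"))  # prints "oo oo oo"
```

The function is called for every match even though it always returns the same string. That per-match call is the overhead this kind of benchmark is meant to expose.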

We have two benchmark input cases, which we call dense and sparse.

The dense case is ten megabytes of alternating spaces and xs like

x x x x x x x x x x x x x x x x x x x x x x x x x x x x

The sparse case is ten megabytes of spaces with a single x in the middle like

                         x
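The two inputs are simple enough to generate programmatically. The sketch below is one possible way to do it, not the repository's generator. The helper names are hypothetical, and the exact byte count is an assumption, since the text does not say whether "ten megabytes" means 10,000,000 or 10 × 1024 × 1024 bytes.

```python
# Assumption: "ten megabytes" is taken as 10,000,000 bytes here.
TEN_MEGABYTES = 10 * 1000 * 1000


def dense_input(size: int) -> str:
    """Alternating "x" and " ", starting with "x", exactly `size` characters."""
    return ("x " * (size // 2 + 1))[:size]


def sparse_input(size: int) -> str:
    """Spaces with a single "x" at the midpoint, exactly `size` characters (size >= 1)."""
    middle = size // 2
    return " " * middle + "x" + " " * (size - middle - 1)


print(repr(dense_input(7)))   # prints 'x x x x'
print(repr(sparse_input(5)))  # prints '  x  '
```

Calling `dense_input(TEN_MEGABYTES)` or `sparse_input(TEN_MEGABYTES)` would then yield a full-size input under the size assumption above.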

Each benchmark program reads the input from stdin, replaces x with oo, and writes the result to stdout. The time elapsed is measured in milliseconds by perf stat, and the best observed time is recorded.
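The shape of a benchmark program and of a best-of-N measurement can be sketched as follows. This is an illustration only: the timing helper uses Python's `time.perf_counter` around a subprocess as a stand-in for `perf stat`, and the names `replace_stream` and `best_wall_clock_ms` as well as the default repeat count are assumptions, not part of the benchmark.

```python
import re
import subprocess
import sys
import time


def replace_stream(source, sink) -> None:
    """Read all of `source`, replace every "x" with "oo", and write to `sink`."""
    sink.write(re.sub("x", lambda match: "oo", source.read()))


def best_wall_clock_ms(command: list[str], input_bytes: bytes, repeats: int = 3) -> float:
    """Run `command` `repeats` times with `input_bytes` on stdin.

    Returns the fastest wall-clock time in milliseconds, including process startup.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded so that terminal speed does not affect the timing.
        subprocess.run(command, input=input_bytes, stdout=subprocess.DEVNULL, check=True)
        timings.append((time.perf_counter() - start) * 1000.0)
    return min(timings)
```

To use the first function as a stdin-to-stdout filter, call `replace_stream(sys.stdin, sys.stdout)`. A call such as `best_wall_clock_ms(["sed", "s/x/oo/g"], data)` would time one candidate program. Taking the minimum over several runs reduces the influence of scheduling noise and cold caches.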

Results

All times are in milliseconds. Smaller is better.

Function replacement

Here is a comparison of replacement methods which can use an arbitrary function to calculate the replacement string.

| Program | dense (ms) | sparse (ms) |
| :--- | ---: | ---: |
| Python 3.10.10 re.sub repl function | 570.42 | 36.24 |
| Perl v5.36.0 s///ge function | 1248.59 | 13.33 |
| Replace.Megaparsec.streamEdit String | 3099.49 | 3020.84 |
| Replace.Megaparsec.streamEdit ByteString | 3930.55 | 774.05 |
| Replace.Megaparsec.streamEdit Text | 4104.92 | 916.76 |
| Replace.Attoparsec.ByteString.streamEdit | 3206.51 | 181.80 |
| Replace.Attoparsec.Text.streamEdit | 3229.30 | 310.53 |
| Replace.Attoparsec.Text.Lazy.streamEdit | 3252.55 | 251.83 |
| Text.Regex.Applicative.replace String | 14366.05 | 4633.66 |
| Text.Regex.PCRE.Heavy.gsub Text | | 119.02 |
| Control.Lens.Regex.ByteString.match | | 119.56 |
| Control.Lens.Regex.Text.match | | 36.91 |

Constant replacement

For reference, here is a comparison of replacement methods which can only replace with a constant string or a templated string.

| Program | dense (ms) | sparse (ms) |
| :--- | ---: | ---: |
| GNU sed 4.9 | 432.55 | 21.18 |
| Python 3.10.10 re.sub repl string | 267.28 | 36.37 |
| Perl v5.36.0 s///g | 220.68 | 11.89 |
| Data.ByteString.Search.replace | 848.27 | 11.19 |
| Data.Text.replace | 549.26 | 99.93 |
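To make the distinction between the two result categories concrete, the following sketch contrasts constant, templated, and function replacement using Python's `re.sub`. The sample text and patterns are invented for illustration and are not the benchmark workload.

```python
import re

text = "ax bx"

# Constant replacement: every match becomes the same literal string.
constant = re.sub("x", "oo", text)

# Templated replacement: the template can refer back to captured groups,
# but it cannot run arbitrary code.
templated = re.sub(r"(\w)x", r"\1\1", text)

# Function replacement: arbitrary code computes each replacement from the match,
# here the position at which the match starts.
computed = re.sub("x", lambda match: str(match.start()), text)

print(constant)   # prints "aoo boo"
print(templated)  # prints "aa bb"
print(computed)   # prints "a1 b4"
```

Only the third form can express replacements that depend on computation over the match, which is the capability the function replacement table compares.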
