Benchmarks
Cernan is intended to be a good citizen. It consumes three major resources:
- CPU
- disk space / IO
- memory
This page will talk about each in turn.
Cernan comes with a set of benchmarks and we, in development, track these closely. On my system the source parsing benchmarks look like so:
```
test bench_graphite                      ... bench:  165 ns/iter (+/- 23)
test bench_statsd_counter_no_sample     ... bench:  159 ns/iter (+/- 19)
test bench_statsd_counter_with_sample   ... bench:  195 ns/iter (+/- 89)
test bench_statsd_gauge_mit_sample      ... bench:  166 ns/iter (+/- 125)
test bench_statsd_gauge_no_sample       ... bench:  156 ns/iter (+/- 21)
test bench_statsd_histogram             ... bench:  158 ns/iter (+/- 26)
test bench_statsd_incr_gauge_no_sample  ... bench:  164 ns/iter (+/- 24)
test bench_statsd_incr_gauge_with_sample ... bench: 172 ns/iter (+/- 48)
test bench_statsd_timer                 ... bench:  161 ns/iter (+/- 21)
```
That is, cernan is able to parse approximately 2,000,000 points per second on my system. The mpsc round-trip benchmark:
```
test bench_snd_rcv ... bench: 2,370 ns/iter (+/- 380)
```
suggests that we're able to clock around 500,000 points from source, to disk, and then out to a sink. Experimentation with the null sink, discussed below, bears this out. We encourage you to run these benchmarks for yourself on your own system. You'll need a nightly compiler to run them, for now, but once you've got one installed,

> cargo bench

will get you where you want to be.
Cernan's disk consumption is proportional to the number of telemetry points added into the system multiplied by the number of enabled sinks, complicated by the speed of said sinks. That is, for each telemetry point that comes into cernan we make N duplicates of it in its parsed form, where N is the number of enabled sinks. If a sink is especially slow--as the firehose sink can be--then more points will pool up in the disk-queue. A fast sink--like null--will keep only a minimal number of points on disk. At present, this is 100MB worth.
Cernan's allocation patterns are tightly controlled. By flushing to disk we reduce the need for especially fancy tricks and sustain a maximum allocation of a few megabytes on our heavily loaded systems. Cernan is vulnerable to randomized attacks--that is, attacks where randomized metric names are shipped--and it may allocate tens of megabytes while sustaining such an attack. If anyone has suggestions for systematically benchmarking memory use we'd be all for it.
The cernan project ships with a suite of micro-benchmarks, but demonstrating the ability to sustain load on a production machine is vital for serious use. To that end we've built evans, a tool to generate random load across cernan's ingestion interfaces at a given hertz. Please see that project's documentation for details.
A Postmates Project