Merge pull request #13 from twiby/performance_evaluation
add a robust performance evaluation system
twiby authored Aug 14, 2024
2 parents 6743c3f + 23e55f6 commit 80a37e5
Showing 21 changed files with 617 additions and 457 deletions.
6 changes: 0 additions & 6 deletions Cargo.toml
@@ -26,11 +26,5 @@ version = "0.8"
optional = true

[dev-dependencies]
criterion = "0.5"
num-bigint = "0.4"
typed_test_gen = "0.1"

[[bench]]
name = "biguint"
harness = false
required-features = ["rand"]
38 changes: 17 additions & 21 deletions README.md
@@ -19,33 +19,23 @@ the following:
```bash
cargo build
cargo doc
cargo bench --features=rand
cargo test
```

For benchmarks specifically, you might want to call only some of these:
```bash
cargo bench mul --features=rand
cargo bench add --features=rand
cargo bench sub --features=rand
```

Benchmarks won't compile/run without the `rand` feature enabled.
For benchmarks, please visit the `benches` folder.

# Performance
The ambitious and naive goal of this project is to be as performant as, or better
than, any state-of-the-art crate on these methods.
More details and scripts about performance are available in the `benches`
folder.

I choose to compare myself to `num-bigint` first, as it's quite standard at this
point.

Today, on x86, `twibint` is faster than `num-bigint` v0.4 for addition, above
around 10000 bits. It is on par for multiplication, starting 1000 bits.
TL;DR -> The current state of `twibint`'s performance (v0.2.7) is: addition,
subtraction and multiplication are faster than for Python integers, and faster
than `num-bigint` at some scales. Division remains extremely slow.

# List of features

- `rand`: enables the possibility to generate a random integer with a specific
number of bits. Uses `rand` crate as a dependency.
- `rand`: exports the function `gen_random_biguint`, which generates a random
integer with a specific number of bits. Uses `rand` crate as a dependency.
- `pyo3`: Only used to generate Python bindings; it's only meant to be used
indirectly via the `pip install .` command. Uses `pyo3` crate as a dependency.
- `unsafe`: Enables accelerations that use unsafe Rust. Enabled by default.
@@ -62,14 +52,20 @@ This crate seems faster than the default Python integers for addition and multip
above a certain number of bits (between 1000 and 10000 bits).

Python tests are available to be run in the `pytest` framework. They are located
in the `tests` folder and should provide ample example usage.
in the `tests` folder and should provide ample example usage. Run the tests with
```bash
pytest tests
```

Performance comparisons with Python's default integers are available in the
`benches` folder.


# Changelog for version 0.2
This new version contains extensive accelerations for addition, subtraction, and
multiplication on x86_64 machines. I used no modern extensions of x86, so these
acceleration should be portable accross this family of mahcines. These will probably also
have performance repercussions on many other features.
accelerations should be portable across this family of machines. These
will probably also have performance repercussions on many other features.

These accelerations are mostly due to dropping inline assembly for core loops, and are
based on `unsafe` Rust. Other `unsafe` features used include smartly swapping between
28 changes: 28 additions & 0 deletions benches/README.md
@@ -0,0 +1,28 @@
# Performance of `twibint`

## comparing to Python integers
A very simple and naive script helps evaluate how the performance compares to
Python integers: `benches.py`.

Simply put, `twibint`'s addition and multiplication are faster, but division
is slower. I've not produced a systematic study comparing the two options at
different scales at this point.

## comparing to `num-bigint`
Running the Python script `run_benchmarks.py` will run a series of benchmarks
for several operations at different scales, and produce figures to compare
performance between `twibint` and `num-bigint`. In the future, I'd like these
benchmarks to include more crates.

For each operation (add, sub, mul, div), we generate a pair of random integers
that have a fixed size in bits (every bit is random except the most
significant one, which is always set, to ensure they always have the same size). We measure the
non-assign version of the operation (we never use `mul_assign` or `add_assign`,
for example). Sometimes we also measure an "asymmetric" version of a binary
operation, where one operand is around 10 times bigger (in bits) than the other.
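
The sizing rule described above can be sketched as follows. This is only an illustration and not the code used by the benchmarks: it is dependency-free, so it swaps the `rand` crate for a tiny xorshift generator, and the names `XorShift64`, `nb_bits` and `random_limbs_with_exact_bits` are hypothetical. It assumes the integer is stored as little-endian `u64` limbs.

```rust
/// Tiny xorshift64 generator, used only to keep this sketch dependency-free.
/// (Assumption: the real benchmarks rely on the `rand` crate instead.)
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Number of significant bits of a little-endian limb slice (0 for zero).
fn nb_bits(limbs: &[u64]) -> usize {
    match limbs.iter().rposition(|&limb| limb != 0) {
        Some(i) => 64 * i + (64 - limbs[i].leading_zeros() as usize),
        None => 0,
    }
}

/// Hypothetical helper: random little-endian limbs with exactly `n` bits.
/// Every bit is random except the most significant one, which is forced to 1.
fn random_limbs_with_exact_bits(n: usize, seed: u64) -> Vec<u64> {
    assert!(n > 0, "need at least one bit");
    assert!(seed != 0, "xorshift needs a non-zero seed");

    let mut rng = XorShift64(seed);
    let nb_limbs = (n + 63) / 64;
    let mut limbs: Vec<u64> = (0..nb_limbs).map(|_| rng.next_u64()).collect();

    // Position of the most significant bit inside the last limb.
    let top_bit = (n - 1) % 64;
    let last = limbs.last_mut().unwrap();
    if top_bit < 63 {
        // Clear the random bits above the requested size.
        *last &= (1u64 << (top_bit + 1)) - 1;
    }
    // Force the most significant bit so the size is always exactly `n`.
    *last |= 1u64 << top_bit;
    limbs
}

fn main() {
    for n in [1, 30, 64, 65, 1_000, 10_000] {
        let limbs = random_limbs_with_exact_bits(n, 0x9E37_79B9_7F4A_7C15);
        println!("requested {} bits, got {}", n, nb_bits(&limbs));
    }
}
```

Forcing the top bit keeps both operands at a known size, so a benchmark labelled with a given number of bits always measures that scale.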

![alt text](plots/sub.png "Subtraction")
![alt text](plots/add.png "Addition")
![alt text](plots/div.png "Division")
![alt text](plots/mul.png "Multiplication")
![alt text](plots/asymetric_mul.png "Asymmetric multiplication")
24 changes: 24 additions & 0 deletions benches/bencher/Cargo.toml
@@ -0,0 +1,24 @@
[package]
name = "bencher"
version = "0.0.0"
edition = "2021"

[dependencies]
rand = "0.8"

[dependencies.twibint]
path = "../../"
features = ["rand"]
optional = true

[dependencies.num-bigint]
version = "0.4"
features = ["rand"]
optional = true

[dev-dependencies]
criterion = "0.5"

[[bench]]
name = "biguint"
harness = false
135 changes: 135 additions & 0 deletions benches/bencher/benches/biguint.rs
@@ -0,0 +1,135 @@
use bencher::gen_random_biguint;
use bencher::GetNbBits;

use criterion::{black_box, criterion_group, criterion_main, Criterion};

pub fn add<const N: usize>(c: &mut Criterion) {
    let n1 = gen_random_biguint(N);
    let n2 = gen_random_biguint(N);

    let mut name = "add ".to_string();
    name.push_str(&n1.get_nb_bits().to_string());
    name.push('+');
    name.push_str(&n2.get_nb_bits().to_string());

    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 + &n2)));
}

criterion_group!(
    biguint_add,
    add<1_000>,
    add<3_000>,
    add<10_000>,
    add<30_000>,
    add<100_000>,
    add<300_000>,
    add<1_000_000>,
    add<3_000_000>,
    add<10_000_000>,
    add<30_000_000>,
    add<100_000_000>,
);

pub fn sub<const N: usize>(c: &mut Criterion) {
    let n2 = gen_random_biguint(N);
    let n1 = &n2 + u64::MAX;

    let mut name = "sub ".to_string();
    name.push_str(&n1.get_nb_bits().to_string());
    name.push('-');
    name.push_str(&n2.get_nb_bits().to_string());

    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 - &n2)));
}

criterion_group!(
    biguint_sub,
    sub<1_000>,
    sub<3_000>,
    sub<10_000>,
    sub<30_000>,
    sub<100_000>,
    sub<300_000>,
    sub<1_000_000>,
    sub<3_000_000>,
    sub<10_000_000>,
    sub<30_000_000>,
    sub<100_000_000>,
);

pub fn mul<const N: usize>(c: &mut Criterion) {
    let n1 = gen_random_biguint(N);
    let n2 = gen_random_biguint(N);

    let mut name = "mul ".to_string();
    name.push_str(&n2.get_nb_bits().to_string());
    name.push('x');
    name.push_str(&n1.get_nb_bits().to_string());

    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 * &n2)));
}

criterion_group!(
    biguint_mul,
    mul<30>,
    mul<100>,
    mul<300>,
    mul<1_000>,
    mul<3_000>,
    mul<10_000>,
    mul<30_000>,
);

pub fn asymetric_mul<const N: usize, const N2: usize>(c: &mut Criterion) {
    let n1 = gen_random_biguint(N);
    let n2 = gen_random_biguint(N2);

    let mut name = "asymetric_mul ".to_string();
    name.push_str(&n2.get_nb_bits().to_string());
    name.push('x');
    name.push_str(&n1.get_nb_bits().to_string());

    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 * &n2)));
}

criterion_group!(
    biguint_asymetric_mul,
    asymetric_mul<30, 3>,
    asymetric_mul<100, 9>,
    asymetric_mul<300, 27>,
    asymetric_mul<1_000, 92>,
    asymetric_mul<3_000, 287>,
    asymetric_mul<10_000, 1001>,
    asymetric_mul<30_000, 3027>,
);

pub fn div<const N: usize, const N2: usize>(c: &mut Criterion) {
    let n1 = gen_random_biguint(N);
    let n2 = gen_random_biguint(N2);

    let mut name = "div ".to_string();
    name.push_str(&n1.get_nb_bits().to_string());
    name.push('/');
    name.push_str(&n2.get_nb_bits().to_string());

    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 / &n2)));
}

criterion_group!(
    biguint_div,
    div<30, 3>,
    div<100, 9>,
    div<300, 27>,
    div<1_000, 92>,
    div<3_000, 287>,
    div<10_000, 1001>,
    div<30_000, 3027>,
);

criterion_main!(
    biguint_add,
    biguint_sub,
    biguint_mul,
    biguint_asymetric_mul,
    biguint_div
);
56 changes: 56 additions & 0 deletions benches/bencher/src/lib.rs
@@ -0,0 +1,56 @@
#![allow(unused_imports)]
use rand::distributions::Standard;
use rand::prelude::*;

#[cfg(not(any(feature = "twibint", feature = "num-bigint")))]
compile_error!("Exactly one feature must be used");

#[cfg(all(feature = "twibint", feature = "num-bigint"))]
compile_error!("Exactly one feature must be used");

#[cfg(feature = "twibint")]
pub fn gen_random_biguint(n: usize) -> twibint::BigUint<u64> {
    let ret = twibint::gen_random_biguint::<u64>(n);
    assert_eq!(ret.nb_bits(), n);
    ret
}

#[cfg(feature = "num-bigint")]
pub fn gen_random_biguint(n: usize) -> num_bigint::BigUint {
    use num_bigint::RandBigInt;

    let n: u64 = n.try_into().unwrap();

    let mut rng = rand::thread_rng();
    let mut ret = rng.gen_biguint(n);
    let nb_bits = ret.bits();

    if nb_bits == 0 {
        ret = num_bigint::BigUint::from(1u32) << (n - 1);
    } else if nb_bits > n {
        ret >>= nb_bits - n;
    } else if nb_bits < n {
        ret <<= n - nb_bits;
    }

    assert_eq!(ret.bits(), n);
    ret
}

pub trait GetNbBits {
    fn get_nb_bits(&self) -> usize;
}

#[cfg(feature = "twibint")]
impl GetNbBits for twibint::BigUint<u64> {
    fn get_nb_bits(&self) -> usize {
        self.nb_bits()
    }
}

#[cfg(feature = "num-bigint")]
impl GetNbBits for num_bigint::BigUint {
    fn get_nb_bits(&self) -> usize {
        self.bits().try_into().unwrap()
    }
}