Merge pull request #13 from twiby/performance_evaluation

add a robust performance evaluation system
twiby · Aug 14, 2024 · 80a37e5
2 parents 6743c3f + 23e55f6
commit 80a37e5

Showing 21 changed files with 617 additions and 457 deletions.
diff --git a/Cargo.toml b/Cargo.toml
@@ -26,11 +26,5 @@ version = "0.8"
 optional = true
 
 [dev-dependencies]
-criterion = "0.5"
 num-bigint = "0.4"
 typed_test_gen = "0.1"
-
-[[bench]]
-name = "biguint"
-harness = false
-required-features = ["rand"]
diff --git a/README.md b/README.md
@@ -19,33 +19,23 @@ the following:
 ```bash
 cargo build
 cargo doc
-cargo bench --features=rand
 cargo test
 ```
 
-For benchmarks specifically, you might want to call only some of these:
-```bash
-cargo bench mul --features=rand
-cargo bench add --features=rand
-cargo bench sub --features=rand
-```
-
-Benchmarks won't compile/run without the `rand` feature enabled.
+For benchmarks, please visit the `benches` folder.
 
 # Performance
-The ambitious and naive goal of this project is to be as performant or better than 
-any state of the art crate on these methods.
+More details and scripts about performance are available in the `benches` 
+folder.
 
-I choose to compare myself to `num-bigint` first, as it's quite standard at this 
-point.
-
-Today, on x86, `twibint` is faster than `num-bigint` v0.4 for addition, above 
-around 10000 bits. It is on par for multiplication, starting 1000 bits. 
+TL;DR -> The current state of `twibint`'s performance (v0.2.7) is: addition, 
+subtraction and multiplication are faster than for Python integers, and faster 
+than `num-bigint` at some scales. Division remains extremely slow.
 
 # List of features
 
-- `rand`: enables the possibility to generate a random integer with a specific 
-number of bits. Uses `rand` crate as a dependency.
+- `rand`: exports the function `gen_random_biguint`, which generates a random integer 
+with a specific number of bits. Uses the `rand` crate as a dependency.
 - `pyo3`: Only used to generate python bindings, it's only meant to be used
 indirectly via the `pip install .` command. Uses `pyo3` crate as a dependency.
 - `unsafe`: Enables accelerations that use unsafe Rust. Enabled by default. 
@@ -62,14 +52,20 @@ This crate seems faster than the default Python integers for addition and multip
 above a certain numbers of bits (between 1000 and 10000 bits).
 
 Python tests are available to be run in the `pytest` framework. They are located
-in the `tests` folder and should provide ample example usage.
+in the `tests` folder and should provide ample example usage. Run the tests with 
+```bash
+pytest tests
+```
+
+Performance comparisons with Python's default integers are available in the
+`benches` folder.
 
 
 # Changelog for version 0.2
 This new version contains extensive accelerations for addition, subtraction, and 
 multiplication on x86_64 machines. I used no modern extensions of x86, so these 
-acceleration should be portable accross this family of mahcines. These will probably also 
-have performance repercussions on many other features.
+accelerations should be portable across this family of machines. These 
+will probably also have performance repercussions on many other features.
 
 These acceleration are mostly due to dropping inline assembly for core loops, and are 
 based on `unsafe` Rust. Other `unsafe` features used include smartly swapping between 

diff --git a/benches/README.md b/benches/README.md
@@ -0,0 +1,28 @@
+# Performance of `twibint`
+
+## comparing to Python integers
+A very simple and naive script helps evaluate how the performance compares to 
+Python integers: `benches.py`.
+
+Simply put, `twibint`'s addition and multiplication are faster, but division 
+is slower. I have not yet produced a systematic study comparing the two options 
+at different scales.
+
+## comparing to `num-bigint`
+Running the Python script `run_benchmarks.py` will run a series of benchmarks 
+for several operations at different scales, and produce figures to compare
+performance between `twibint` and `num-bigint`. In the future, I'd like these
+benchmarks to include more crates.
+
+For each operation (add, sub, mul, div), we generate a pair of random integers
+that have a given size in bits (every bit is random except the most 
+significant one, to ensure they always have the same size). We measure the 
+non-assign version of the operation (we never use `mul_assign` or `add_assign`, 
+for example). Sometimes we also measure an "asymmetric" version of a binary 
+operation, where one operand is around 3 times bigger than the other.
+
+![alt text](plots/sub.png "Subtraction")
+![alt text](plots/add.png "Addition")
+![alt text](plots/div.png "Division")
+![alt text](plots/mul.png "Multiplication")
+![alt text](plots/asymetric_mul.png "Asymetric multiplication")
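
The operand generation described in this README (every bit random except the most significant one, so the size is fixed) can be sketched without any external crate. This is only an illustration under stated assumptions: `XorShift64`, `random_limbs_with_bits` and `nb_bits` are hypothetical names, the integer is modelled as little-endian `u64` limbs, and the toy generator stands in for the `rand` crate. It is not how `twibint::gen_random_biguint` is actually implemented.

```rust
/// Toy xorshift64 generator (hypothetical helper, NOT what `rand` or
/// `twibint` use); it only keeps this sketch free of dependencies.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // A xorshift state must never be zero, so replace a zero seed.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Little-endian 64-bit limbs of a random integer with exactly `n` bits.
pub fn random_limbs_with_bits(n: usize, rng: &mut XorShift64) -> Vec<u64> {
    assert!(n >= 1, "an integer with zero bits has no most significant bit");
    let nb_limbs = (n + 63) / 64;
    let mut limbs: Vec<u64> = (0..nb_limbs).map(|_| rng.next_u64()).collect();

    // Position of the most significant bit inside the last limb.
    let top_bit = (n - 1) % 64;
    let last = limbs.last_mut().unwrap();
    // Clear the random bits above the requested size...
    if top_bit < 63 {
        *last &= (1u64 << (top_bit + 1)) - 1;
    }
    // ...and force the most significant bit, which fixes the size.
    *last |= 1u64 << top_bit;
    limbs
}

/// Number of significant bits of a little-endian limb slice (0 for zero).
pub fn nb_bits(limbs: &[u64]) -> usize {
    match limbs.iter().rposition(|&l| l != 0) {
        Some(i) => i * 64 + (64 - limbs[i].leading_zeros() as usize),
        None => 0,
    }
}
```

Forcing the top bit costs one bit of randomness but guarantees that both operands of a benchmark have the same size, so timings at a given scale stay comparable between runs.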
diff --git a/benches/bencher/Cargo.toml b/benches/bencher/Cargo.toml
@@ -0,0 +1,24 @@
+[package]
+name = "bencher"
+version = "0.0.0"
+edition = "2021"
+
+[dependencies]
+rand = "0.8"
+
+[dependencies.twibint]
+path = "../../"
+features = ["rand"]
+optional = true
+
+[dependencies.num-bigint]
+version = "0.4"
+features = ["rand"]
+optional = true
+
+[dev-dependencies]
+criterion = "0.5"
+
+[[bench]]
+name = "biguint"
+harness = false
diff --git a/benches/bencher/benches/biguint.rs b/benches/bencher/benches/biguint.rs
@@ -0,0 +1,135 @@
+use bencher::gen_random_biguint;
+use bencher::GetNbBits;
+
+use criterion::{black_box, criterion_group, criterion_main, Criterion};
+
+pub fn add<const N: usize>(c: &mut Criterion) {
+    let n1 = gen_random_biguint(N);
+    let n2 = gen_random_biguint(N);
+
+    let mut name = "add ".to_string();
+    name.push_str(&n1.get_nb_bits().to_string());
+    name.push('+');
+    name.push_str(&n2.get_nb_bits().to_string());
+
+    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 + &n2)));
+}
+
+criterion_group!(
+    biguint_add,
+    add<1_000>,
+    add<3_000>,
+    add<10_000>,
+    add<30_000>,
+    add<100_000>,
+    add<300_000>,
+    add<1_000_000>,
+    add<3_000_000>,
+    add<10_000_000>,
+    add<30_000_000>,
+    add<100_000_000>,
+);
+
+pub fn sub<const N: usize>(c: &mut Criterion) {
+    let n2 = gen_random_biguint(N);
+    let n1 = &n2 + u64::MAX;
+
+    let mut name = "sub ".to_string();
+    name.push_str(&n1.get_nb_bits().to_string());
+    name.push('-');
+    name.push_str(&n2.get_nb_bits().to_string());
+
+    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 - &n2)));
+}
+
+criterion_group!(
+    biguint_sub,
+    sub<1_000>,
+    sub<3_000>,
+    sub<10_000>,
+    sub<30_000>,
+    sub<100_000>,
+    sub<300_000>,
+    sub<1_000_000>,
+    sub<3_000_000>,
+    sub<10_000_000>,
+    sub<30_000_000>,
+    sub<100_000_000>,
+);
+
+pub fn mul<const N: usize>(c: &mut Criterion) {
+    let n1 = gen_random_biguint(N);
+    let n2 = gen_random_biguint(N);
+
+    let mut name = "mul ".to_string();
+    name.push_str(&n2.get_nb_bits().to_string());
+    name.push('x');
+    name.push_str(&n1.get_nb_bits().to_string());
+
+    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 * &n2)));
+}
+
+criterion_group!(
+    biguint_mul,
+    mul<30>,
+    mul<100>,
+    mul<300>,
+    mul<1_000>,
+    mul<3_000>,
+    mul<10_000>,
+    mul<30_000>,
+);
+
+pub fn asymetric_mul<const N: usize, const N2: usize>(c: &mut Criterion) {
+    let n1 = gen_random_biguint(N);
+    let n2 = gen_random_biguint(N2);
+
+    let mut name = "asymetric_mul ".to_string();
+    name.push_str(&n2.get_nb_bits().to_string());
+    name.push('x');
+    name.push_str(&n1.get_nb_bits().to_string());
+
+    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 * &n2)));
+}
+
+criterion_group!(
+    biguint_asymetric_mul,
+    asymetric_mul<30, 3>,
+    asymetric_mul<100, 9>,
+    asymetric_mul<300, 27>,
+    asymetric_mul<1_000, 92>,
+    asymetric_mul<3_000, 287>,
+    asymetric_mul<10_000, 1001>,
+    asymetric_mul<30_000, 3027>,
+);
+
+pub fn div<const N: usize, const N2: usize>(c: &mut Criterion) {
+    let n1 = gen_random_biguint(N);
+    let n2 = gen_random_biguint(N2);
+
+    let mut name = "div ".to_string();
+    name.push_str(&n1.get_nb_bits().to_string());
+    name.push('/');
+    name.push_str(&n2.get_nb_bits().to_string());
+
+    c.bench_function(name.as_str(), |b| b.iter(|| black_box(&n1 / &n2)));
+}
+
+criterion_group!(
+    biguint_div,
+    div<30, 3>,
+    div<100, 9>,
+    div<300, 27>,
+    div<1_000, 92>,
+    div<3_000, 287>,
+    div<10_000, 1001>,
+    div<30_000, 3027>,
+);
+
+criterion_main!(
+    biguint_add,
+    biguint_sub,
+    biguint_mul,
+    biguint_asymetric_mul,
+    biguint_div
+);
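
To make the shape of these benchmarks easier to see, here is a minimal std-only sketch of the two things each function above does: build a label from the operand sizes, then repeatedly run the non-assign operation behind `black_box`. `bench_label` and `mean_time` are hypothetical helpers, not part of this PR or of Criterion, and the timing loop is a crude stand-in that has none of Criterion's warm-up or statistical analysis.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Builds a label such as "add 1000+1000" from the operand bit counts.
pub fn bench_label(op: &str, lhs_bits: usize, sep: char, rhs_bits: usize) -> String {
    format!("{op} {lhs_bits}{sep}{rhs_bits}")
}

/// Runs `f` exactly `iters` times and returns the mean wall-clock time per
/// call. `black_box` keeps the compiler from optimizing the result away.
pub fn mean_time<T>(iters: u32, mut f: impl FnMut() -> T) -> Duration {
    assert!(iters > 0, "at least one iteration is required");
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed() / iters
}
```

A call such as `mean_time(1_000, || &n1 + &n2)` would mirror the `b.iter(|| black_box(&n1 + &n2))` closures above, since both operands are borrowed and a fresh result is allocated on every iteration.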
diff --git a/benches/bencher/src/lib.rs b/benches/bencher/src/lib.rs
@@ -0,0 +1,56 @@
+#![allow(unused_imports)]
+use rand::distributions::Standard;
+use rand::prelude::*;
+
+#[cfg(not(any(feature = "twibint", feature = "num-bigint")))]
+compile_error!("Exactly one feature must be used");
+
+#[cfg(all(feature = "twibint", feature = "num-bigint"))]
+compile_error!("Exactly one feature must be used");
+
+#[cfg(feature = "twibint")]
+pub fn gen_random_biguint(n: usize) -> twibint::BigUint<u64> {
+    let ret = twibint::gen_random_biguint::<u64>(n);
+    assert_eq!(ret.nb_bits(), n);
+    ret
+}
+
+#[cfg(feature = "num-bigint")]
+pub fn gen_random_biguint(n: usize) -> num_bigint::BigUint {
+    use num_bigint::RandBigInt;
+
+    let n: u64 = n.try_into().unwrap();
+
+    let mut rng = rand::thread_rng();
+    let mut ret = rng.gen_biguint(n);
+    let nb_bits = ret.bits();
+
+    if nb_bits == 0 {
+        ret = num_bigint::BigUint::from(1u32) << (n - 1);
+    } else if nb_bits > n {
+        ret >>= nb_bits - n;
+    } else if nb_bits < n {
+        ret <<= n - nb_bits;
+    }
+
+    assert_eq!(ret.bits(), n);
+    ret
+}
+
+pub trait GetNbBits {
+    fn get_nb_bits(&self) -> usize;
+}
+
+#[cfg(feature = "twibint")]
+impl GetNbBits for twibint::BigUint<u64> {
+    fn get_nb_bits(&self) -> usize {
+        self.nb_bits()
+    }
+}
+
+#[cfg(feature = "num-bigint")]
+impl GetNbBits for num_bigint::BigUint {
+    fn get_nb_bits(&self) -> usize {
+        self.bits().try_into().unwrap()
+    }
+}
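
The `num-bigint` branch of `gen_random_biguint` normalizes the random value by shifting it until it has exactly `n` bits. The same branch logic can be checked in isolation on a primitive integer. The sketch below assumes `u128` as a stand-in for `BigUint`, and `bits` and `force_bit_length` are hypothetical names introduced only for illustration.

```rust
/// Number of significant bits of `x` (0 for zero), analogous to `BigUint::bits`.
pub fn bits(x: u128) -> u32 {
    128 - x.leading_zeros()
}

/// Shifts `x` so that the result has exactly `n` significant bits
/// (1 <= n <= 128), following the same three branches as the diff above.
pub fn force_bit_length(x: u128, n: u32) -> u128 {
    assert!((1..=128).contains(&n), "n must fit in a u128");
    let nb_bits = bits(x);
    if nb_bits == 0 {
        // Zero has no set bit to move: fall back to the smallest n-bit value.
        1u128 << (n - 1)
    } else if nb_bits > n {
        // Too large: drop the excess low bits.
        x >> (nb_bits - n)
    } else {
        // Too small (or already right): move the top bit up to position n - 1.
        x << (n - nb_bits)
    }
}
```

One consequence worth keeping in mind is that the left shift fills the low bits with zeros, so a value that came out short ends up slightly less random in its low bits than one generated at full length.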