Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial Shplonk prover/verifier implementation #301

Merged
merged 9 commits into from
Feb 13, 2024
Merged

Initial Shplonk prover/verifier implementation #301

merged 9 commits into from
Feb 13, 2024

Conversation

storojs72
Copy link
Member

@storojs72 storojs72 commented Feb 8, 2024

This PR adds Shplonk-based optimisation to HyperKZG PCS. For simplifying the reviewing process and focusing on correctness at first, It is implemented as a separate PCS (one more implementation of EvaluationEngineTrait) and contains most of @adr1anh's suggestions kindly provided in both private discussions and during intermediate reviewing of draft version.

Once this PR is merged, I'm planning to have some more follow-up PRs to solve #302, #304 and #303.

Copy link
Member

@huitseeker huitseeker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot. I think we can have an initial side-by-side implementation in this PR for review, and once we're happy with the status of the PR, we can add a last commit that moves the changes from shplonk to hyperkzg.

As long as we're taking this approach, would you mind adding shplonk to the PCS benchmarks?

@@ -61,18 +61,20 @@ where
E::Fr: TranscriptReprTrait<E::G1>,
E::G1Affine: TranscriptReprTrait<E::G1>, // TODO: this bound on DlogGroup is really unusable!
{
fn compute_challenge(
/// TODO: write doc
pub fn compute_challenge(
com: &[E::G1Affine],
transcript: &mut impl TranscriptEngineTrait<NE>,
) -> E::Fr {
transcript.absorb(b"c", &com.to_vec().as_slice());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove to_vec().as_slice(), the transcript already takes slices.

com: &[E::G1Affine],
transcript: &mut impl TranscriptEngineTrait<NE>,
) -> E::Fr {
transcript.absorb(b"c", &com.to_vec().as_slice());
transcript.squeeze(b"c").unwrap()
}

/// TODO: write doc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the existing comment already :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so I do not focus on this only because we are going to merge Shplonk with HyperKZG eventually. The followup PR or "final" commit that you mentioned will for sure include these TODO resolvings

@@ -88,14 +90,16 @@ where
transcript.squeeze(b"r").unwrap()
}

fn batch_challenge_powers(q: E::Fr, k: usize) -> Vec<E::Fr> {
/// TODO: write doc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inner comment should perhaps just be moved as a doc comment of this function. See also crate::spartan::powers, which could easily replace this function and its uses. We just need to move that function somewhere more accessible (the math module comes to mind).

Comment on lines 97 to 98
<NE::CE as CommitmentEngineTrait<NE>>::commit(ck, &polys[i])
.comm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, you don't want to use to_affine() in a loop. Consider using batch_normalize to post process the projective points, see #290 for examples.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to do that, but can you please explain the reason? As batch_normalize seems to iterate over vector of G1 elements and invoke to_affine for each similarly:

 fn batch_normalize(p: &[Self], q: &mut [Self::AffineRepr]) {
        assert_eq!(p.len(), q.len());

        for (p, q) in p.iter().zip(q.iter_mut()) {
            *q = p.to_affine();
        }
    }

Copy link
Member

@huitseeker huitseeker Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

batch_normalize is part of the Curve trait, and what you're showing is the default implementation that kicks in if the implementer choses to not optimize it. In fact, in most cases, the implementer does implement something better, and so it does not work that way for the curves that interest us. In fact it's optimized in pasta:
https://github.com/zcash/pasta_curves/blob/df67e299e6644c035a213dbad369be279276919c/src/curves.rs#L173-L207
and for halo2curves:
https://github.com/privacy-scaling-explorations/halo2curves/blob/9fff22c5f72cc54fac1ef3a844e1072b08cfecdf/src/derive/curve.rs#L583-L614

Comment on lines +147 to +154
E::Fr: Serialize + DeserializeOwned,
E::G1Affine: Serialize + DeserializeOwned,
E::G2Affine: Serialize + DeserializeOwned,
E::G1: DlogGroup<ScalarExt = E::Fr, AffineExt = E::G1Affine>,
<E::G1 as Group>::Base: TranscriptReprTrait<E::G1>, // Note: due to the move of the bound TranscriptReprTrait<G> on G::Base from Group to Engine
E::Fr: PrimeFieldBits, // TODO due to use of gen_srs_for_testing, make optional
E::Fr: TranscriptReprTrait<E::G1>,
E::G1Affine: TranscriptReprTrait<E::G1>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really a pity we have to go through this declaration slog 😦 The good news is those bounds are auto-satisfied (e.g. E::Fr : Serialize comes through NE: NovaEngine<Scalar = E::Fr>), which makes the type rather usable.

Copy link
Member Author

@storojs72 storojs72 Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, our trait bounds are indeed quite complicated, so I just copy-pasted them from hyperkzg with minimal adaptations to satisfy compiler I think

Comment on lines 71 to 75
#[allow(clippy::needless_range_loop)]
Pi.par_iter_mut().enumerate().for_each(|(j, Pi_j)| {
*Pi_j =
point[point.len() - i - 1] * (polys[i][2 * j + 1] - polys[i][2 * j]) + polys[i][2 * j];
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aka:

      (0..Pi_len).into_par_iter().map(|j| {
        point[ell - i - 1] * (polys[i][2 * j + 1] - polys[i][2 * j]) + polys[i][2 * j]
      }).collect_into_vec(&mut Pi);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Indeed cleaner and doesn't require adding clippy exception

// K(x) = P(x) - Q(x) * D(a) - R(a), note that R(a) should be subtracted from a free term of polynomial
let K_x = Self::compute_k_polynomial(&batched_Pi, &Q_x, &D, &R_x, a);

// TODO: since this is a usual KZG10 we should use it as utility instead
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the very important definition:
https://github.com/lurk-lab/arecibo/blob/7a5d7bf0e9be80b8b5c4d47bb99c67c2b058b35e/src/provider/non_hiding_kzg.rs#L224-L225
IOW, you're invoking this as a UVKZGPoly, but really, you can use UniPoly everywhere instead. I should perhaps not have made this type alias public.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is true for Zeromorph as well, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

let q = HyperKZG::<E, NE>::get_batch_challenge(&pi.evals, transcript);
let q_powers = HyperKZG::<E, NE>::batch_challenge_powers(q, pi.comms.len());

let R_x = UVKZGPoly::new(pi.R_x.clone());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark, this is a UniPoly morally speaking.

Comment on lines 268 to 274
let mut batched_eval = E::Fr::ZERO;
evals_i
.iter()
.zip_eq(q_powers.iter())
.for_each(|(eval, q_i)| {
batched_eval += *eval * q_i;
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is UniPoly::ref_cast(evals_i).evaluate(q), eliminating the need for q_powers.

Comment on lines 307 to 310
let C_P = q_powers
.iter()
.zip_eq(pi.comms.iter())
.fold(E::G1::identity(), |acc, (q_i, C_i)| acc + *C_i * q_i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like pi_comms.iter().rlc(q) I think.

@storojs72
Copy link
Member Author

My draft benchmarking experiments (with "hacked" parallelism - I'm looking forward review comments from @adr1anh about that) show that with Shplonk we should have slightly better prover and comparable verifier (to current HyperKZG):

PCS-Proving 10/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...                                                                           
                        time:   [7.3302 ms 7.4526 ms 7.6260 ms]
                        change: [-0.2125% +3.2892% +6.3721%] (p = 0.08 > 0.05)
                        No change in performance detected.
PCS-Proving 10/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...                                                                           
                        time:   [6.3622 ms 6.5637 ms 6.7723 ms]
                        change: [+4.2467% +6.2231% +8.5170%] (p = 0.00 < 0.05)
                        Performance has regressed.

PCS-Verifying 10/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, ar...                                                                            
                        time:   [2.0429 ms 2.0463 ms 2.0514 ms]
                        change: [-16.865% -9.5600% -3.2519%] (p = 0.01 < 0.05)
                        Performance has improved.
PCS-Verifying 10/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, are...                                                                            
                        time:   [2.0973 ms 2.0994 ms 2.1019 ms]
                        change: [+0.8078% +0.9705% +1.1206%] (p = 0.00 < 0.05)
                        Change within noise threshold.

PCS-Proving 12/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...                                                                           
                        time:   [16.870 ms 16.887 ms 16.924 ms]
                        change: [-6.7220% -4.2327% -1.8784%] (p = 0.00 < 0.05)
                        Performance has improved.
PCS-Proving 12/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...                                                                           
                        time:   [15.260 ms 15.379 ms 15.464 ms]
                        change: [+0.0949% +0.5357% +1.0399%] (p = 0.05 > 0.05)
                        No change in performance detected.

PCS-Verifying 12/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, ar...                                                                            
                        time:   [2.0975 ms 2.1340 ms 2.1669 ms]
                        change: [-2.4215% -1.4423% -0.3105%] (p = 0.03 < 0.05)
                        Change within noise threshold.
PCS-Verifying 12/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, are...                                                                            
                        time:   [2.1346 ms 2.1549 ms 2.1740 ms]
                        change: [+2.5498% +4.3679% +6.4286%] (p = 0.00 < 0.05)
                        Performance has regressed.

PCS-Proving 14/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...                                                                           
                        time:   [49.327 ms 50.146 ms 51.284 ms]
                        change: [+0.8152% +2.4361% +4.0370%] (p = 0.01 < 0.05)
                        Change within noise threshold.
PCS-Proving 14/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...                                                                           
                        time:   [42.929 ms 44.007 ms 44.668 ms]
                        change: [+1.5793% +3.5868% +5.5920%] (p = 0.00 < 0.05)
                        Performance has regressed.

PCS-Verifying 14/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, ar...                                                                            
                        time:   [2.1705 ms 2.1902 ms 2.2028 ms]
                        change: [-3.7569% -2.3103% -0.8618%] (p = 0.01 < 0.05)
                        Change within noise threshold.
PCS-Verifying 14/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, are...                                                                            
                        time:   [2.1224 ms 2.3098 ms 2.8080 ms]
                        change: [+0.4765% +10.268% +27.970%] (p = 0.22 > 0.05)
                        No change in performance detected.

Benchmarking PCS-Proving 16/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 9.1s or enable flat sampling.
PCS-Proving 16/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...                                                                          
                        time:   [162.43 ms 163.87 ms 166.55 ms]
                        change: [-3.7779% -1.2925% +1.0884%] (p = 0.35 > 0.05)
                        No change in performance detected.
Benchmarking PCS-Proving 16/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 8.0s or enable flat sampling.
PCS-Proving 16/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...                                                                          
                        time:   [143.59 ms 145.72 ms 147.12 ms]
                        change: [-6.0379% -3.3437% -0.5428%] (p = 0.04 < 0.05)
                        Change within noise threshold.

PCS-Verifying 16/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, ar...                                                                            
                        time:   [2.0650 ms 2.0662 ms 2.0678 ms]
                        change: [-3.5172% -2.2347% -1.0415%] (p = 0.00 < 0.05)
                        Performance has improved.
PCS-Verifying 16/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, are...                                                                            
                        time:   [2.1881 ms 2.1910 ms 2.1935 ms]
                        change: [-0.3295% -0.1397% +0.0561%] (p = 0.19 > 0.05)
                        No change in performance detected.

Benchmarking PCS-Proving 18/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.3s.
PCS-Proving 18/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...                                                                          
                        time:   [653.13 ms 675.68 ms 698.05 ms]
                        change: [+6.9224% +10.838% +14.568%] (p = 0.00 < 0.05)
                        Performance has regressed.
Benchmarking PCS-Proving 18/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.1s.
PCS-Proving 18/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...                                                                          
                        time:   [494.35 ms 512.11 ms 533.28 ms]
                        change: [-4.8880% -1.3660% +3.2409%] (p = 0.55 > 0.05)
                        No change in performance detected.

PCS-Verifying 18/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, ar...                                                                            
                        time:   [2.1195 ms 2.1319 ms 2.1556 ms]
                        change: [+0.0758% +0.7821% +1.5436%] (p = 0.05 > 0.05)
                        No change in performance detected.
PCS-Verifying 18/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, are...                                                                            
                        time:   [2.2969 ms 2.3197 ms 2.3434 ms]
                        change: [+0.5188% +1.6314% +2.6774%] (p = 0.01 < 0.05)
                        Change within noise threshold.

Benchmarking PCS-Proving 20/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 23.9s.
PCS-Proving 20/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, arec...                                                                          
                        time:   [2.3023 s 2.3299 s 2.3594 s]
Benchmarking PCS-Proving 20/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 21.9s.
PCS-Proving 20/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, areci...                                                                          
                        time:   [2.1019 s 2.1146 s 2.1299 s]
                        change: [-14.905% -8.0909% -1.9854%] (p = 0.02 < 0.05)
                        Performance has improved.

PCS-Verifying 20/arecibo::provider::hyperkzg::EvaluationEngine<halo2curves::bn256::engine::Bn256, ar...                                                                            
                        time:   [2.1501 ms 2.1513 ms 2.1531 ms]
PCS-Verifying 20/arecibo::provider::shplonk::EvaluationEngine<halo2curves::bn256::engine::Bn256, are...                                                                            
                        time:   [2.2794 ms 2.2872 ms 2.2942 ms]

Despite the MSM elimination at verifier side, in shplonk we still have heavy C_P computing.

@huitseeker
Copy link
Member

huitseeker commented Feb 9, 2024

Despite the MSM elimination at verifier side, in shplonk we still have heavy C_P computing

I have an upcoming branch that may help with that (https://github.com/huitseeker/arecibo/tree/rayon-parscan : warning, very WIP)

Copy link
Member

@huitseeker huitseeker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, great base to keep iterating!

@storojs72 storojs72 added this pull request to the merge queue Feb 13, 2024
Merged via the queue into dev with commit 4cc09c8 Feb 13, 2024
9 checks passed
@storojs72 storojs72 deleted the shplonk branch February 13, 2024 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants