Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADVZ: drop the restriction that recovery_threshold, multiplicity must be powers of two #668

Open
ggutoski opened this issue Aug 23, 2024 · 0 comments

Comments

@ggutoski
Copy link
Contributor

Currently Advz construction enforces that recovery_threshold and multiplicity must be powers of two:

jellyfish/vid/src/advz.rs

Lines 176 to 185 in 7cd4f76

// TODO TEMPORARY: enforce power-of-2 chunk size
// Remove this restriction after we get KZG in eval form
// https://github.com/EspressoSystems/jellyfish/issues/339
if chunk_size as usize != eval_domain.size() {
return Err(VidError::Argument(format!(
"recovery_threshold {} currently unsupported, round to {} instead",
chunk_size,
eval_domain.size()
)));
}

Code comments link to #339 in several places but do not explain why this issue blocks us from allowing power-of-two args in ADVZ.

Why the restriction?

In ADVZ we encode payload data into polynomials in evaluation form so as to facilitate payload proofs (aka "namespace proofs") in payload_prover.rs. We do not yet have KZG in eval form (#339) so instead we use FFT to compute polynomial coefficients from payload data, then use KZG in coefficient form for payload proofs. Ten months ago in internal discussion I said (paraphrasing):

A limitation of this approach is that payload recovery is possible only if recovery_threshold * multiplicity is a power of two. Why? Because FFT is applied to all polynomials, which effectively "rounds up" the degree d to next_power_of_2(d). Interpolating such a polynomial requires next_power_of_2(d) points.

This is wrong. A degree-d polynomial is padded with zero until its degree becomes next_power_of_2(d). Yes, you need next_power_of_2(d) points to interpolate such a polynomial, but we already know that next_power_of_2(d) - d of those points are zero, so we only need d additional points to interpolate. Thus, the payload is recoverable under this approach for any desired degree d.

10 months ago I also said "this limitation can be removed after we have a proper implementation of KZG in eval form." But of course that's incorrect---we do not need KZG in eval form in order to remove the power-of-two restriction.

Efficiency concerns

The FFT operates only on data sets whose length is a power of two. Thus, if we use it on a non-power-of-two data set the implementation must waste computation: we pad a size-d data set with zeros until its size is next_power_of_2(d), do FFTs, then discard excess data back down to size d. This is wasteful, but it's not a security issue.

Also, ADVZ dispersal uses FK23 algorithm to efficiently compute many KZG proofs. FK23 is a FFT-like algorithm that operates only on data sets whose length is a power of two, so we introduce a similar inefficiency in FK23 if we operate on non-power-of-two data sets. But again, this inefficiency does not introduce a security concern: it's safe to use non-power-of-two data sets.

TODO

  • short term: explain in docs. future code should link to this issue in comments
  • long term: remove the power-of-two restriction in such a way that payload recovery is possible for all payload data sizes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants