feature: mixed-radix NTT fast twiddles mode #382
@@ -116,7 +123,7 @@ namespace ntt {
     int tid = blockDim.x * blockIdx.x + threadIdx.x;
     if (tid >= n_elements * batch_size) return;
     int64_t scalar_id = tid % n_elements;
-    if (rev_type != eRevType::None) scalar_id = generalized_rev(tid, logn, dit, rev_type);
+    if (rev_type != eRevType::None) scalar_id = generalized_rev(tid, logn, dit, false, rev_type);
Shouldn't we be passing fast_tw as a parameter here?
This is the kernel that handles cosets, and it doesn't work with the fast twiddles. That's why I removed the parameter and pass false; otherwise it would compute incorrect results.
Update: actually, I'm not sure why it doesn't work with the fast twiddles. I did not debug it.
bb42693 to 8f36c1c
- this mode is allocating additional 4N twiddles to achieve faster computation
Co-authored-by: hadaringonyama <hadar@ingonyama.com>
… and device arrays
8f36c1c to 9a2957b
…mean it is initialized but that it is allocated
Overall, looks good. Minor comments, can be ignored
…anyway with one mutex" This reverts commit 77a220c.
# Contents of this release

- Examples: multi-gpu example #381
- Examples: updates example compares Radix2 and MixedRadix NTTs #383
- Feat: add vector operations bindings to Rust #384
- Examples: update examples with new vec ops #388
- Feat: Grumpkin curve implementation #379
- Feat: mixed-radix NTT fast twiddles mode #382
- Docs: Update README.md #385 #387
- README: Update Hall of Fame section #394
- Examples: add rust poseidon example #392
- Feat: GoLang bindings for v1.x #386
This mode allocates 4N additional twiddle factors (where N is the maximum NTT size) to enable faster compute kernels.
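
To get a feel for the memory cost of the 4N additional twiddle factors, here is a rough sketch in C++. The helper name `fast_twiddles_extra_bytes` and its parameters are hypothetical and not part of the library; the element size is an input because it depends on the field in use.

```cpp
#include <cstdint>

// Hypothetical helper (not a library function): estimates the extra device
// memory used by the fast-twiddles mode, assuming it stores exactly 4N
// additional twiddle factors of `scalar_bytes` bytes each.
constexpr std::uint64_t fast_twiddles_extra_bytes(unsigned log_max_size, std::uint64_t scalar_bytes)
{
  const std::uint64_t n = std::uint64_t{1} << log_max_size; // N = maximum NTT size
  return 4 * n * scalar_bytes;                              // 4N twiddle factors
}
```

For instance, assuming N = 2^20 and 32-byte scalars (an assumption; the actual size depends on the field), the extra allocation would be 2^27 bytes, i.e. 128 MiB.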