Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: mixed-radix NTT fast twiddles mode #382

Merged
merged 7 commits into from
Feb 21, 2024
Merged

feature: mixed-radix NTT fast twiddles mode #382

merged 7 commits into from
Feb 21, 2024

Conversation

yshekel
Copy link
Collaborator

@yshekel yshekel commented Feb 18, 2024

This mode is allocating 4N additional twiddle-factors (where N is max NTT size) for faster compute kernels.

@@ -116,7 +123,7 @@ namespace ntt {
int tid = blockDim.x * blockIdx.x + threadIdx.x;
if (tid >= n_elements * batch_size) return;
int64_t scalar_id = tid % n_elements;
if (rev_type != eRevType::None) scalar_id = generalized_rev(tid, logn, dit, rev_type);
if (rev_type != eRevType::None) scalar_id = generalized_rev(tid, logn, dit, false, rev_type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we be passing fast_tw as a parameter here?

Copy link
Collaborator Author

@yshekel yshekel Feb 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's the kernel that handles cosets but it doesn't work with the fast-twiddles. That's why I removed the parameter and pass false. Otherwise it would calculate incorrectly.

Update: actually I am not sure why it doesn't work with the fast twiddles. I did not debug.

@yshekel yshekel force-pushed the fast-twiddles branch 4 times, most recently from bb42693 to 8f36c1c Compare February 21, 2024 15:55
yshekel and others added 3 commits February 21, 2024 18:33
- this mode is allocating additional 4N twiddles to achieve faster computation

Co-authored-by: hadaringonyama <hadar@ingonyama.com>
Copy link
Contributor

@DmytroTym DmytroTym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good. Minor comments, can be ignored

icicle/appUtils/ntt/ntt.cuh Outdated Show resolved Hide resolved
icicle/appUtils/ntt/kernel_ntt.cu Show resolved Hide resolved
icicle/appUtils/ntt/thread_ntt.cu Show resolved Hide resolved
@yshekel yshekel merged commit 275b2f4 into dev Feb 21, 2024
14 checks passed
@yshekel yshekel deleted the fast-twiddles branch February 21, 2024 22:02
@jeremyfelder jeremyfelder mentioned this pull request Feb 22, 2024
@DmytroTym DmytroTym mentioned this pull request Feb 22, 2024
4 tasks
jeremyfelder added a commit that referenced this pull request Feb 23, 2024
# Contents of this release

Examples: multi-gpu example #381
Examples: updates example compares Radix2 and MixedRadix NTTs #383
Feat: add vector operations bindings to Rust #384 
Examples: update examples with new vec ops #388 
Feat: Grumpkin curve implementation #379 
Feat: mixed-radix NTT fast twiddles mode #382 
Docs: Update README.md #385 #387 
README: Update Hall of Fame section #394 
Examples: add rust poseidon example #392 
Feat: GoLang bindings for v1.x #386
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants