CUTLASS 3.3.0 #1167

Merged
merged 2 commits into from
Nov 2, 2023

@IonThruster (Collaborator) commented Nov 2, 2023

  • Support for mixed input GEMMs on Hopper and Ampere.
  • Support for < 16B aligned tensors in SM90 GEMMs.
  • Enhancements to EVT (Epilogue Visitor Tree) & support for new fusions (dRELU, dBias, etc.).
  • Enhancements to CUTLASS Python interface.
  • Enhancements to sub-byte type functionality in CuTe.
  • Support for Clang as host compiler.
  • Several other bug fixes and performance improvements.
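
For readers unfamiliar with the term, "mixed input" here is generally understood to mean a GEMM whose two operands have different element types, with the narrower operand converted (and optionally scaled) to the wider type before the multiply-accumulate. The pure-Python reference below is only a sketch of that semantics under this assumption. The function name `mixed_input_gemm` and the `scale` parameter are hypothetical and are not part of the CUTLASS API.

```python
def mixed_input_gemm(a_narrow, b_wide, scale=1.0):
    """Reference D = (scale * A) @ B for a narrow-integer A and a float B.

    a_narrow: M x K nested list of small integers (stands in for e.g. int8).
    b_wide:   K x N nested list of floats (stands in for e.g. fp16).
    scale:    hypothetical scalar applied to A after conversion to float.
    """
    m, k = len(a_narrow), len(a_narrow[0])
    k_b, n = len(b_wide), len(b_wide[0])
    if k != k_b:
        raise ValueError("inner dimensions of A and B must match")

    d = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                # Upcast the narrow operand first, then multiply-accumulate
                # in the wider type.
                a_converted = float(a_narrow[i][p]) * scale
                acc += a_converted * b_wide[p][j]
            d[i][j] = acc
    return d


result = mixed_input_gemm([[1, 2], [3, 4]], [[0.5, 1.0], [1.5, 2.0]])
```

A real kernel would perform the conversion in registers just before the tensor-core instruction; the loop order above is chosen for readability only.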

Adds support for mixed precision GEMMs on Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to sub-byte type handling in CuTe
Several other bug fixes and performance improvements.
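
To make the "< 16B aligned" item concrete, the sketch below checks whether a contiguous extent of elements spans a multiple of 16 bytes (128 bits), and if not, the largest power-of-two alignment it does satisfy. It assumes that "16B aligned" refers to the byte size of the contiguous dimension. The helper names are hypothetical and do not come from CUTLASS.

```python
def is_16b_aligned(num_elements, bits_per_element):
    """True if the contiguous extent spans a whole multiple of 16 bytes."""
    return (num_elements * bits_per_element) % 128 == 0


def max_alignment_bytes(num_elements, bits_per_element):
    """Largest power-of-two byte alignment (capped at 16) the extent satisfies."""
    total_bits = num_elements * bits_per_element
    if total_bits % 8 != 0:
        # Sub-byte types can leave a partial trailing byte.
        raise ValueError("extent does not span a whole number of bytes")
    total_bytes = total_bits // 8
    for alignment in (16, 8, 4, 2, 1):
        if total_bytes % alignment == 0:
            return alignment


# A 16-bit type with K = 4096 spans 8192 bytes; K = 4100 spans 8200 bytes.
aligned = is_16b_aligned(4096, 16)
unaligned = is_16b_aligned(4100, 16)
fallback = max_alignment_bytes(4100, 16)
```

Under this reading, a 16-bit tensor needs its contiguous extent to be a multiple of 8 elements to be 16B aligned, and a 4-bit tensor needs a multiple of 32.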

4 participants