[Build] update version of "cutlass" #19891
Comments
Yes, I encountered this as well. There seems to be a bug in cutlass 3.1. The latest cutlass seems to have it fixed; however, when I update to cutlass 3.4.1 and retry the build, I encounter other build errors. Which version of cutlass did you use to get a successful build?
"cutlass" compilation was against |
Just to confirm: you had to update cutlass to main AND also disable warnings-as-errors to get the build to succeed?
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details. |
Do we need to upgrade the cutlass version?
Yes, it needs a cutlass upgrade to 3.5 and some code changes in ORT. I'm working on that.
### Description

Upgrade cutlass to 3.5 to fix build errors using CUDA 12.4 or 12.5 in Windows

- [x] Upgrade cutlass to 3.5.0.
- [x] Fix flash attention build error with latest cutlass header files and APIs. This fix is provided by @wangyems.
- [x] Update efficient attention to use new cutlass fmha interface.
- [x] Patch cutlass to fix `hrsqrt` not found error for sm < 53.
- [x] Disable TF32 Staged Accumulation to fix blkq4_fp16_gemm_sm80_test build error for cuda 11.8 to 12.3.
- [x] Disable TRT 10 deprecate warnings.

The following are not included in this PR:

* TRT provider replaces the deprecated APIs.
* Fix blkq4_fp16_gemm_sm80_test build error for cuda 12.4 or 12.5. This test is not built by default unless you add `--cmake_extra_defines onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON` in build command.

To integrate to rel-1.18.1: Either bring in other changes (like onnx 1.16.1), or generate manifest and upload a new ONNX Runtime Build Time Deps artifact based on rel-1.18.1.

### Motivation and Context

#19891
#20924
#20953
Describe the issue
The Triton team is facing a build issue when trying to compile ONNX Runtime.
The issue was observed with CUDA 12.4 and "cutlass" v3.1.0.
Note
--compile_no_warning_as_error
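For readers trying to reproduce the build, here is a minimal sketch of how the two build options mentioned in this thread could be combined into one invocation. The helper name `build_command` and the baseline arguments (`build.bat --config Release --use_cuda`) are assumptions for illustration; they are not taken from the reporter's actual build script, which is not shown in this issue.

```python
def build_command(no_warning_as_error=True, internal_cuda_tests=False):
    """Assemble an argument list for an ONNX Runtime build (illustrative only)."""
    # Assumed baseline invocation; the reporter's real build script is not shown.
    cmd = ["build.bat", "--config", "Release", "--use_cuda"]
    if no_warning_as_error:
        # Flag quoted in the issue's "Note" field.
        cmd.append("--compile_no_warning_as_error")
    if internal_cuda_tests:
        # Define quoted in the PR description; it enables the internal CUDA EP tests.
        cmd += ["--cmake_extra_defines",
                "onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON"]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command(internal_cuda_tests=True)))
```

Keeping the arguments as a list rather than a single string avoids quoting problems if the command is later passed to `subprocess.run`.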
Urgency
No response
Target platform
WIN32
Build script
Error / output
Visual Studio Version
BUILDTOOLS_VERSION:17.9.34622.214
CMAKE_VERSION:3.27.1
CUDA_VERSION:12.4.0
CUDNN_VERSION:9.0.0.312
PYTHON_VERSION:3.8.10
TENSORRT_VERSION:8.6.1.6
VCPGK_VERSION:2023.11.20
GCC / Compiler Version
No response