[Experience Optimization] Consolidate the training CUDA and Triton operators into paddlenlp_kernel #9471

Merged: 18 commits, Dec 23, 2024

JunnYu (Member) commented Nov 21, 2024

PR types

New features

PR changes

APIs

Description

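Adds a new paddlenlp_kernel wheel package that consolidates the CUDA and Triton operators used during training.
Currently supported:

  • mamba1 and mamba2 operators
  • fast_ln and fused_ln operators
  • fused_linear_cross_entropy operator
  • Paddle backward operators, such as flash attn, flash mask, add, and matmul
  • inf_cl operator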

Build the CUDA operators

cd csrc
rm -rf build dist *.egg-info  # remove files and directories left by previous builds
python setup.py build  # start the build

Package the wheel

Once the CUDA operators are compiled, package them into a wheel so they can be installed:

python setup.py bdist_wheel

Install the wheel

Install the newly built wheel with pip:

pip install dist/*.whl
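
After the install step, it can be useful to confirm that the wheel actually landed in the active environment. The following is a minimal sketch using only the Python standard library. It assumes the wheel registers its distribution under the name `paddlenlp_kernel`, which this PR does not state explicitly, and the helper `installed_version` is a hypothetical name introduced here for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution name matches the package name in the PR title.
    name = "paddlenlp_kernel"
    version = installed_version(name)
    if version is None:
        print(f"{name} is not installed in this environment")
    else:
        print(f"{name} {version} is installed")
```

If the check reports the package as missing, the likely causes are that `pip` installed into a different interpreter than the one running the check, or that the distribution name differs from the one assumed here.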

paddle-bot bot commented Nov 21, 2024

Thanks for your contribution!

codecov bot commented Nov 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 52.79%. Comparing base (25415fb) to head (a04be9f).
Report is 12 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9471      +/-   ##
===========================================
- Coverage    53.16%   52.79%   -0.37%     
===========================================
  Files          718      718              
  Lines       113862   112252    -1610     
===========================================
- Hits         60532    59263    -1269     
+ Misses       53330    52989     -341     


JunnYu requested a review from DrownFish19 December 19, 2024 07:31
JunnYu requested a review from ZHUI December 20, 2024 02:38
JunnYu changed the title from "[Experience Optimization] Consolidate the training CUDA and Triton operators into paddlenlp_gpu_ops" to "[Experience Optimization] Consolidate the training CUDA and Triton operators into ppnlp_kernel" Dec 20, 2024
JunnYu changed the title from "[Experience Optimization] Consolidate the training CUDA and Triton operators into ppnlp_kernel" to "[Experience Optimization] Consolidate the training CUDA and Triton operators into paddlenlp_kernel" Dec 20, 2024
ZHUI merged commit 1842d6d into PaddlePaddle:develop Dec 23, 2024
10 of 12 checks passed