
[NPU][LLM] add README & reformat llama scripts #8642

Merged
4 commits merged on Jun 22, 2024

Conversation

SylarTiaNII
Contributor

PR types

Others

PR changes

Others

Description

Add a README for llama training/inference on NPU (910).


paddle-bot bot commented Jun 20, 2024

Thanks for your contribution!

Contributor

@ronny1996 ronny1996 left a comment


LGTM


codecov bot commented Jun 20, 2024

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 55.81%. Comparing base (65e721e) to head (5419849).

Current head 5419849 differs from pull request most recent head 7ab4a98

Please upload reports for the commit 7ab4a98 to get more accurate results.

Files Patch % Lines
paddlenlp/transformers/mc2_parallel_linear.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8642      +/-   ##
===========================================
+ Coverage    55.80%   55.81%   +0.01%     
===========================================
  Files          620      620              
  Lines        96642    96599      -43     
===========================================
- Hits         53928    53917      -11     
+ Misses       42714    42682      -32     


@onecatcn

LGTM

@@ -0,0 +1,198 @@
## 🚣‍♂️ Running the llama2-13b model with PaddleNLP on NPU 🚣
PaddleNLP has deeply adapted and optimized the llama2-13B model for Ascend NPU ([learn about Ascend](https://www.hiascend.com/zh/ecosystem/industry)). The suite largely unifies the training/inference entry points between Ascend NPU and GPU, achieving a "seamless switch" effect.
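The "seamless switch" claim above amounts to one entry script whose backend is chosen by a flag. A minimal sketch of that idea, assuming an argparse-based entry; only the `--device npu` flag is taken from the real commands in this PR, everything else is illustrative:

```python
import argparse

def build_parser():
    # Sketch of a unified train/infer entry: the same script serves GPU
    # and Ascend NPU, switched only by --device. The --device flag is
    # quoted from the commands in this PR; the defaults are illustrative.
    parser = argparse.ArgumentParser(description="llama train/infer entry")
    parser.add_argument("--device", choices=["gpu", "npu"], default="gpu",
                        help="backend to run on; 'npu' targets Ascend 910")
    parser.add_argument("--model_name_or_path", default="./merged_model",
                        help="checkpoint to load (path is illustrative)")
    return parser

args = build_parser().parse_args(["--device", "npu"])
print(args.device)
```

With this shape, switching hardware is only a matter of passing a different `--device` value; no other part of the invocation changes.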
Collaborator


https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/npu/llama

Please put the files under llm/npu; our refactoring PR has already been merged.

}
```
To keep inference cost fully minimized, we use a static-graph implementation. The static-graph model therefore needs to be exported from the dynamic-graph model produced by training; run the following command to export it:
```
Contributor


Change this to:

cd PaddleNLP/llm
python predict/export_model.py --model_name_or_path merged_model  --inference_model --output_path ./inference --dtype float16  --device npu  --block_attn

At present, this only runs when executed from PaddleNLP/llm.

```
Finally, we run inference with the static-graph model:
```
# Run the inference code
Contributor


Change this to:

# Run the inference code
python predict/predictor.py  --model_name_or_path ./inference --inference_model --dtype "float16" --mode "static" --block_attn --device npu
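Taken together, the two suggested commands form a two-step static-graph flow: export first, then predict. A sketch that only assembles both argv lists; both commands are quoted from the review comments and must be run from PaddleNLP/llm, and actually executing them needs an Ascend environment plus a trained checkpoint:

```python
import shlex

# Step 1: export a static graph from the trained dynamic-graph checkpoint.
EXPORT_CMD = shlex.split(
    "python predict/export_model.py --model_name_or_path merged_model "
    "--inference_model --output_path ./inference --dtype float16 "
    "--device npu --block_attn"
)

# Step 2: run inference against the exported static graph.
INFER_CMD = shlex.split(
    "python predict/predictor.py --model_name_or_path ./inference "
    "--inference_model --dtype float16 --mode static --block_attn "
    "--device npu"
)

# subprocess.run(EXPORT_CMD, check=True) would execute step 1; here we
# only print the script paths to show the ordering of the two steps.
print(EXPORT_CMD[1])
print(INFER_CMD[1])
```

Note that step 2 consumes the `./inference` directory that step 1 produces, so the order matters.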

@ronny1996
Contributor

https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/predict/export_model.py#L104: please change this line to

from npu.llama.export_utils import process_params
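One way to apply the requested import change without breaking non-NPU installs is a guarded import. A sketch, where only the module path `npu.llama.export_utils` comes from the comment above; the no-op fallback is hypothetical:

```python
def load_process_params():
    # Prefer the NPU-specific helper when it is importable (module path
    # taken from the review comment); otherwise fall back to a no-op so
    # export_model.py keeps working in environments without the helper.
    try:
        from npu.llama.export_utils import process_params
        return process_params
    except ImportError:
        return lambda state_dict: state_dict  # hypothetical no-op fallback

process_params = load_process_params()
print(process_params({"weight": 1}))
```

In an environment without the NPU helper the fallback simply returns the state dict unchanged.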

@SylarTiaNII SylarTiaNII force-pushed the reformat_npu_llama branch 2 times, most recently from 43c883d to f9f424c Compare June 21, 2024 12:23
source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/atb/set_env.sh

export PYTHONPATH=../:$PYTHONPATH
Collaborator


Move this to llm/npu.


model_path=${1:-"./inference"}

source /usr/local/Ascend/ascend-toolkit/set_env.sh
Collaborator


Put this under llm/npu.

@ZeyuChen ZeyuChen merged commit 25d2140 into PaddlePaddle:develop Jun 22, 2024
5 of 9 checks passed