[NPU][LLM] add README & reformat llama scripts #8642
Conversation
Thanks for your contribution!
Force-pushed from c511f7c to 5419849
LGTM
Codecov Report
Attention: Patch coverage is

Additional details and impacted files:

@@            Coverage Diff            @@
##           develop    #8642   +/-   ##
===========================================
+ Coverage    55.80%   55.81%   +0.01%
===========================================
  Files          620      620
  Lines        96642    96599      -43
===========================================
- Hits         53928    53917      -11
+ Misses       42714    42682      -32

☔ View full report in Codecov by Sentry.
LGTM
llm/llama/npu/README.md
Outdated
@@ -0,0 +1,198 @@
## 🚣‍♂️ Running the llama2-13b model with PaddleNLP on NPU 🚣
PaddleNLP has been deeply adapted and optimized for the llama2-13B model on Ascend NPU ([learn about Ascend](https://www.hiascend.com/zh/ecosystem/industry)). The toolkit largely unifies the training and inference entry points across Ascend NPU and GPU, achieving a "seamless switch" experience.
https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/npu/llama
Put the files under llm/npu; our refactoring PR has already been merged.
llm/llama/npu/README.md
Outdated
}
```
To keep inference cost minimal, we use a static-graph implementation, so a static-graph model must be exported from the dynamic-graph model produced by training. Run the following command to export it:
```
Change this to:
cd PaddleNLP/llm
python predict/export_model.py --model_name_or_path merged_model --inference_model --output_path ./inference --dtype float16 --device npu --block_attn
At the moment this only runs from inside PaddleNLP/llm.
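The suggested export invocation can be audited as a small shell sketch. The paths (`merged_model`, `./inference`) and flags are taken from the review comment above; actually running the export needs a PaddleNLP checkout and Ascend NPU drivers, so this sketch only assembles and prints the command line:

```shell
# Assemble the export command as a bash array so each flag stays auditable.
# merged_model / ./inference are the paths used in the review comment.
cmd=(python predict/export_model.py
     --model_name_or_path merged_model
     --inference_model
     --output_path ./inference
     --dtype float16
     --device npu
     --block_attn)
echo "${cmd[@]}"
```

As the reviewer notes, the real command must be launched from inside `PaddleNLP/llm` (`cd PaddleNLP/llm` first).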
llm/llama/npu/README.md
Outdated
```
Finally, we run inference with the static-graph model:
```
# run the inference code
Change this to:
# run the inference code
python predict/predictor.py --model_name_or_path ./inference --inference_model --dtype "float16" --mode "static" --block_attn --device npu
https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/predict/export_model.py#L104 please change this to
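Similarly, the suggested static-graph inference invocation can be sketched. The flags are taken verbatim from the review comment, and NPU hardware is assumed for a real run, so again only the command line is assembled and printed:

```shell
# Inference command from the review comment: --mode static selects the
# exported static graph and --device npu targets Ascend hardware.
cmd=(python predict/predictor.py
     --model_name_or_path ./inference
     --inference_model
     --dtype float16
     --mode static
     --block_attn
     --device npu)
echo "${cmd[@]}"
```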
Force-pushed from 43c883d to f9f424c
llm/export_npu.sh
Outdated
source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/atb/set_env.sh

export PYTHONPATH=../:$PYTHONPATH
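The script sources the Ascend toolkit environment scripts and prepends the parent directory to `PYTHONPATH`. The prepend behaviour can be checked in isolation; the `/usr/local/Ascend/...` paths are the toolkit's default install locations and only exist on a real NPU host, so they are commented out in this sketch:

```shell
# The two Ascend set_env.sh sources only exist on an NPU machine, so they
# are commented out here:
# source /usr/local/Ascend/ascend-toolkit/set_env.sh
# source /usr/local/Ascend/atb/set_env.sh

# Prepend the parent directory so Python imports the local checkout first.
export PYTHONPATH=../:$PYTHONPATH
echo "$PYTHONPATH"
```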
Move this to llm/npu
llm/predict_npu.sh
Outdated
model_path=${1:-"./inference"}

source /usr/local/Ascend/ascend-toolkit/set_env.sh
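The `model_path=${1:-"./inference"}` line uses bash's default-value parameter expansion: the first positional argument selects the exported model directory, falling back to `./inference` when the script is invoked without arguments. A minimal sketch of that pattern:

```shell
# ${1:-default} expands to $1 when a first argument was given, otherwise
# to the default; predict_npu.sh uses it to locate the exported model.
model_path=${1:-"./inference"}
echo "model_path=$model_path"
```

Invoked with no arguments, this prints `model_path=./inference`; passing a path as the first argument overrides the default.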
Move to llm/npu
Force-pushed from f9f424c to 7ab4a98
PR types
Others
PR changes
Others
Description
Add a README for llama training and inference on NPU (910).