
Add fused linear for the LLAMA MLP block and multi-head attention block #6425

Merged: 4 commits into PaddlePaddle:develop on Jul 26, 2023

Conversation

@littsk (Contributor) commented Jul 18, 2023

PR types

New features

PR changes

Models

Description

Add fused linear operation for MLP and multi-head attention in LLAMA.
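For context on what the fusion means: instead of launching separate GEMMs for the q/k/v projections, the fused variant runs one wider GEMM and splits the result. Below is a minimal sketch of the QKV case using plain paddle.nn.Linear (the PR itself uses the tensor-parallel mpu.ColumnParallelLinear; the sizes and names here are illustrative assumptions, not the PR's code):

```python
import paddle
import paddle.nn as nn

hidden_size = 4096  # illustrative LLaMA-7B-like width

# Unfused: three separate projections, three GEMM launches per layer.
q_proj = nn.Linear(hidden_size, hidden_size, bias_attr=False)
k_proj = nn.Linear(hidden_size, hidden_size, bias_attr=False)
v_proj = nn.Linear(hidden_size, hidden_size, bias_attr=False)

# Fused: one projection producing q, k and v in a single GEMM.
qkv_proj = nn.Linear(hidden_size, 3 * hidden_size, bias_attr=False)

x = paddle.randn([2, 16, hidden_size])           # [batch, seq_len, hidden]
q, k, v = paddle.split(qkv_proj(x), 3, axis=-1)  # each [2, 16, hidden]
```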

paddle-bot (bot) commented Jul 18, 2023

Thanks for your contribution!

Review thread on examples/language_model/llama/run_pretrain.py (outdated, resolved)
    gather_output=False,
)
if self.fuse_attn_qkv:
    self.qkv_proj = mpu.ColumnParallelLinear(
Collaborator: For loading and converting pretrained parameters, could we make the conversion automatic based on the fuse setting?

littsk (Contributor, Author): Could you explain what exactly you mean?
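The thread above appears to concern checkpoint compatibility: a checkpoint saved with separate q/k/v weights would need to be concatenated into a single qkv weight when the fuse flag is on, and split back on save. A rough NumPy sketch of that conversion, purely illustrative (the key names and weight layout are assumptions, not PaddleNLP's actual state-dict format):

```python
import numpy as np

def fuse_qkv_state_dict(state_dict, prefix="llama.layers.0.self_attn."):
    """Concatenate separate q/k/v projection weights into one fused qkv weight.

    Assumes Linear weights stored as [in_features, out_features], so the
    concatenation happens along the output (last) axis.
    """
    q = state_dict.pop(prefix + "q_proj.weight")
    k = state_dict.pop(prefix + "k_proj.weight")
    v = state_dict.pop(prefix + "v_proj.weight")
    state_dict[prefix + "qkv_proj.weight"] = np.concatenate([q, k, v], axis=-1)
    return state_dict
```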

Review thread on paddlenlp/transformers/llama/modeling.py (outdated, resolved)
@littsk littsk force-pushed the fused_linear_feature branch 3 times, most recently from a460685 to 3672383 Compare July 19, 2023 12:12
@littsk littsk force-pushed the fused_linear_feature branch from 3672383 to 178e63b Compare July 19, 2023 12:14
codecov bot commented Jul 19, 2023

Codecov Report

Merging #6425 (394608f) into develop (8e27802) will decrease coverage by 0.05%.
Report is 37 commits behind head on develop.
The diff coverage is 65.33%.

@@             Coverage Diff             @@
##           develop    #6425      +/-   ##
===========================================
- Coverage    63.18%   63.14%   -0.05%     
===========================================
  Files          529      529              
  Lines        77214    77315     +101     
===========================================
+ Hits         48789    48817      +28     
- Misses       28425    28498      +73     
Files Changed | Coverage | Δ
--- | --- | ---
paddlenlp/taskflow/text_feature_extraction.py | 46.00% <50.00%> | +0.62% ⬆️
paddlenlp/transformers/llama/modeling.py | 70.86% <65.38%> | -5.30% ⬇️
paddlenlp/taskflow/task.py | 64.21% <100.00%> | ø
paddlenlp/taskflow/taskflow.py | 84.88% <100.00%> | ø
paddlenlp/transformers/llama/configuration.py | 100.00% <100.00%> | ø

... and 24 files with indirect coverage changes

Review thread on paddlenlp/transformers/llama/modeling.py (outdated, resolved)
@littsk littsk force-pushed the fused_linear_feature branch from 9495473 to 32c36cb Compare July 24, 2023 05:47
@sijunhe (Collaborator) left a comment: Looks OK to me. @ZHUI, could you take a look?

@littsk littsk force-pushed the fused_linear_feature branch from 32c36cb to 6412892 Compare July 24, 2023 07:47
ZHUI previously approved these changes Jul 25, 2023
    gather_output=False,
    has_bias=False,
)
# 为了减少张量并行的通信量,将两个linear合并成一个 (merge the two linears into one to reduce tensor-parallel communication)
Collaborator: Please delete the Chinese comment.

    has_bias=False,
)
# 为了减少张量并行的通信量,将两个linear合并成一个 (merge the two linears into one to reduce tensor-parallel communication)
if config.fuse_mlp_linear:
Collaborator: Would fuse_mlp_linear read better under a different name? Maybe fuse_gate_up_proj, or something else?

littsk (Contributor, Author): Sure, will do.
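For reference, the MLP-side fusion discussed here merges the gate and up projections of the SwiGLU feed-forward block into one matmul. A minimal single-device sketch (the PR routes this through ColumnParallelLinear, and the flag was renamed as suggested above; sizes and names here are illustrative assumptions):

```python
import paddle
import paddle.nn as nn
import paddle.nn.functional as F

hidden_size, intermediate_size = 4096, 11008  # illustrative LLaMA-7B sizes

# Fused gate/up projection: one GEMM instead of two.
gate_up_proj = nn.Linear(hidden_size, 2 * intermediate_size, bias_attr=False)
down_proj = nn.Linear(intermediate_size, hidden_size, bias_attr=False)

def mlp_forward(x):
    gate, up = paddle.split(gate_up_proj(x), 2, axis=-1)
    return down_proj(F.silu(gate) * up)
```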

@littsk littsk force-pushed the fused_linear_feature branch 2 times, most recently from e1e2083 to 7927e54 Compare July 26, 2023 02:44
@littsk littsk force-pushed the fused_linear_feature branch from 7927e54 to 394608f Compare July 26, 2023 02:49
@ZHUI (Collaborator) left a comment: LGTM

@littsk littsk requested review from sijunhe and FeixLiu July 26, 2023 02:59
@FeixLiu (Contributor) left a comment: LGTM for the fuse

@zjjlivein zjjlivein merged commit d1050f4 into PaddlePaddle:develop Jul 26, 2023
triple-Mu pushed a commit to triple-Mu/PaddleNLP that referenced this pull request Aug 3, 2023
Add fused linear for the LLAMA MLP block and multi-head attention block (PaddlePaddle#6425)

* Add fused linear for the LLAMA MLP block and multi-head attention block

* Refactor the config name 'fuse_attn_qkv' to 'fuse_attention_qkv' for improved readability and consistency.

* Added a switch for fuse_mlp_linear and improved the organization of the fused linear implementation.

* add tensor parallel mappings for fused linears
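On the last commit (tensor parallel mappings): a fused column-parallel weight cannot be split naively across ranks, because each rank must receive a slice of every fused sub-matrix (q, k and v), not one whole sub-matrix. A NumPy sketch of that sharding rule, with the function name and weight layout as assumptions rather than PaddleNLP's actual mapping code:

```python
import numpy as np

def split_fused_column_parallel(weight, num_ranks, num_fused=3):
    """Shard a fused column-parallel weight of shape [in, num_fused * out].

    Split each fused sub-matrix (e.g. q, k, v) across ranks separately, then
    concatenate the rank-local pieces, so every rank keeps a slice of every
    sub-matrix.
    """
    parts = np.split(weight, num_fused, axis=-1)            # q, k, v blocks
    shards = [np.split(p, num_ranks, axis=-1) for p in parts]
    return [np.concatenate([s[r] for s in shards], axis=-1)  # per-rank weight
            for r in range(num_ranks)]
```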