
xpu devices support llama-7b basic mode inference (turn on BlockAttention) #8588

Merged
merged 3 commits into PaddlePaddle:develop from zhink:xpullama2 on Jun 13, 2024

Conversation

zhink (Contributor) commented on Jun 12, 2024


PR types

New features

PR changes

Others

Description

xpu devices support llama-7b basic mode inference (turn on BlockAttention)
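
Since the description is a single line, here is a minimal sketch of what targeting a Kunlun XPU looks like on the Paddle side before running the llama-7b inference path this PR touches. It is illustrative only and not taken from the PR; the device-selection calls are standard Paddle APIs, and everything else about the inference run is left out.

```python
# Minimal sketch (not from this PR): confirm the install is an XPU build and
# place execution on a Kunlun XPU before running llama-7b inference.
import paddle

assert paddle.device.is_compiled_with_xpu(), "requires a PaddlePaddle XPU build"
paddle.set_device("xpu:0")            # run subsequent ops on XPU card 0
print(paddle.device.get_device())     # expected: "xpu:0"
```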


paddle-bot bot commented Jun 12, 2024

Thanks for your contribution!


codecov bot commented Jun 12, 2024

Codecov Report

Attention: Patch coverage is 0% with 31 lines in your changes missing coverage. Please review.

Project coverage is 54.41%. Comparing base (525eef7) to head (8717c5e).
Report is 239 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...erimental/transformers/fused_transformer_layers.py | 0.00% | 11 Missing ⚠️ |
| ...enlp/experimental/transformers/generation_utils.py | 0.00% | 9 Missing ⚠️ |
| ...dlenlp/experimental/transformers/llama/modeling.py | 0.00% | 3 Missing ⚠️ |
| ...dlenlp/experimental/transformers/bloom/modeling.py | 0.00% | 2 Missing ⚠️ |
| ...ddlenlp/experimental/transformers/qwen/modeling.py | 0.00% | 2 Missing ⚠️ |
| ...enlp/experimental/transformers/chatglm/modeling.py | 0.00% | 1 Missing ⚠️ |
| ...p/experimental/transformers/chatglm_v2/modeling.py | 0.00% | 1 Missing ⚠️ |
| ...addlenlp/experimental/transformers/gpt/modeling.py | 0.00% | 1 Missing ⚠️ |
| ...addlenlp/experimental/transformers/opt/modeling.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8588      +/-   ##
===========================================
- Coverage    54.42%   54.41%   -0.02%     
===========================================
  Files          632      632              
  Lines        99451    99470      +19     
===========================================
  Hits         54129    54129              
- Misses       45322    45341      +19     


@@ -25,27 +25,27 @@
from paddle.nn import Layer
from paddle.nn.initializer import Constant
from paddle.nn.quant import weight_only_linear
from paddlenlp_ops import rebuild_padding_v2
Collaborator commented:

Uh, this is imported directly here, and is_paddlenlp_ops_available is only checked afterwards? That's already too late, isn't it?

zhink (Contributor, Author) replied:

done
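
For context, the guard the reviewer is asking for looks roughly like the sketch below: only import the compiled custom op when the ops package is actually available. is_paddlenlp_ops_available is the helper named in the review; its import path here is an assumption.

```python
# Sketch of the guarded import (assumption: the availability helper lives in
# paddlenlp.utils.import_utils).
from paddlenlp.utils.import_utils import is_paddlenlp_ops_available

if is_paddlenlp_ops_available():
    from paddlenlp_ops import rebuild_padding_v2  # compiled custom op
else:
    rebuild_padding_v2 = None  # ops package not built; callers must handle None
```

An unconditional top-level `from paddlenlp_ops import rebuild_padding_v2` would raise ImportError before any later is_paddlenlp_ops_available check runs, which is the reviewer's point.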

zhink requested a review from ZHUI on June 13, 2024 at 03:04
ZHUI merged commit 3d777c1 into PaddlePaddle:develop on June 13, 2024
7 of 12 checks passed
zhink deleted the xpullama2 branch on June 13, 2024 at 03:11