[LLM Inference] support llama3.1 #8929
Conversation
Thanks for your contribution!
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files

```
@@           Coverage Diff            @@
##           develop    #8929   +/-   ##
===========================================
+ Coverage    55.05%   55.11%   +0.06%
===========================================
  Files          635      635
  Lines        99410    99548     +138
===========================================
+ Hits         54729    54870     +141
+ Misses       44681    44678       -3
```

☔ View full report in Codecov by Sentry.
```diff
-            self.vocab_size,
-            self.hidden_size,
-        )
+        self.embed_tokens = nn.Embedding(self.vocab_size, self.hidden_size)
```
why?
Multi-GPU inference hit two problems: 1. with block_attn set to false, dynamic-to-static conversion fails; 2. with block_attn set to true, execution fails at runtime.

Suggest a follow-up that fixes the inference errors with the TP-sharded Embedding, rather than simply switching to an unsharded one. @yuanlehome
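As an illustration of the suggested follow-up, here is a minimal sketch (not code from this PR; the helper name and the use of `fleet.meta_parallel.VocabParallelEmbedding` are assumptions) of keeping the embedding TP-sharded on multi-GPU runs while falling back to the plain layer on a single card:

```python
import paddle.nn as nn
from paddle.distributed import fleet

def build_embed_tokens(vocab_size, hidden_size, tensor_parallel_degree=1):
    # Hypothetical helper, not from the PR: keep the vocab-parallel embedding
    # when tensor parallelism is active instead of always using nn.Embedding.
    if tensor_parallel_degree > 1:
        # Each rank stores vocab_size // tensor_parallel_degree rows and
        # reduces the looked-up activations across the model-parallel group.
        # Assumes fleet has already been initialized for hybrid parallelism.
        return fleet.meta_parallel.VocabParallelEmbedding(vocab_size, hidden_size)
    # Single card: the rank holds the full [vocab_size, hidden_size] table.
    return nn.Embedding(vocab_size, hidden_size)
```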
LGTM
Got it; added a TODO for this.
PR types
New features
PR changes
Others
Description
Support llama3.1; --block_attn now supports multi-GPU inference.
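A hedged usage sketch of what this enables (the model id, parallel degree, and generation arguments are assumptions, not taken from this PR); launch it with `python -m paddle.distributed.launch --gpus "0,1" infer.py`:

```python
import paddle.distributed as dist
from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint id; substitute the llama3.1 weights you actually use.
MODEL = "meta-llama/Meta-Llama-3.1-8B"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
# Shard the weights across the ranks started by paddle.distributed.launch.
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    dtype="bfloat16",
    tensor_parallel_degree=2,
    tensor_parallel_rank=dist.get_rank(),
)

inputs = tokenizer("Hello, llama3.1!", return_tensors="pd")
ids, _ = model.generate(**inputs, max_length=64)
print(tokenizer.batch_decode(ids, skip_special_tokens=True))
```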