fix llama3 static run #8849
Thanks for your contribution!
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #8849     +/-   ##
===========================================
+ Coverage    53.81%   53.87%   +0.06%
===========================================
  Files          652      652
  Lines       104356   104356
===========================================
+ Hits         56155    56220     +65
+ Misses       48201    48136     -65
e63c9b6 to 02680fa
LGTM
paddlenlp/generation/utils.py
Outdated
@@ -1407,6 +1407,7 @@ def _post_process_(
         # compute next_tokens
         if use_top_p:
             logits = logits / temperature
+            probs = paddle.cast(probs, paddle.float32)
Should we add bf16 support to the top_p_sampling operator instead?
The operator's kernel implementation does support bf16; the cast to fp32 is here because an error is raised without it.
What exactly causes the error?
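I checked again, and the kernel was indeed not registered for bf16. I have added that support on the Paddle side, so the cast logic added here has been removed. Verification passes without problems.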
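For readers unfamiliar with the operator under discussion, here is a rough pure-Python sketch of what top-p (nucleus) sampling computes. It is an illustration only, not Paddle's kernel: the function names `top_p_filter` and `sample_top_p` are hypothetical, and it works on plain Python floats rather than bf16 or fp32 tensors.

```python
import random


def top_p_filter(probs, top_p):
    """Return {token_id: renormalized probability} for the top-p nucleus.

    Hypothetical helper for illustration; not part of Paddle or PaddleNLP.
    """
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Visit token ids from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for idx in order:
        kept.append(idx)
        cumulative += probs[idx]
        # Stop once the kept tokens cover at least top_p of the mass.
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_top_p(probs, top_p, rng=random):
    """Draw one token id from the renormalized nucleus."""
    nucleus = top_p_filter(probs, top_p)
    ids = list(nucleus)
    weights = [nucleus[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible, which is convenient when comparing outputs across runs.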
02680fa to 10d3e95
LGTM
PR types
Bug fixes
PR changes
Others
Description
Fixes a series of issues in llama3 static-graph inference with non-fused ops; accuracy is as expected.