[INFER][LLM] Support qwen in fined grained dybatch v1 #7644
Conversation
Thanks for your contribution!
Codecov Report

Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #7644      +/-   ##
===========================================
- Coverage    57.48%   57.12%    -0.37%
===========================================
  Files          583      587        +4
  Lines        87187    88190     +1003
===========================================
+ Hits         50123    50376      +253
- Misses       37064    37814      +750

☔ View full report in Codecov by Sentry.
Force-pushed from 9e07b6d to 87caabd
Judging from the PaddleNLP-CI logs, the precision is not aligned.
qwen inference
#!/bin/bash
python3 predictor.py \
--model_name_or_path qwen/qwen-7b \
--decode_strategy greedy_search \
--batch_size 1 \
--dtype float16
#!/bin/bash
python3 predictor.py \
--model_name_or_path qwen/qwen-7b \
--decode_strategy greedy_search \
--batch_size 2 \
--dtype float16

qwen inference with --inference_model
#!/bin/bash
python3 predictor.py \
--model_name_or_path qwen/qwen-7b \
--decode_strategy greedy_search \
--batch_size 1 \
--inference_model \
--dtype float16
#!/bin/bash
python3 predictor.py \
--model_name_or_path qwen/qwen-7b \
--decode_strategy greedy_search \
--batch_size 2 \
--inference_model \
--dtype float16

cc @wj-Mcat
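As a quick sanity check for the precision-alignment issue reported by PaddleNLP-CI above, one could diff the outputs of the dynamic-graph run and the --inference_model run under greedy search. The sketch below is an assumption: it simply tees stdout of each run to a file and diffs them, which assumes greedy search is deterministic and that predictor.py prints the generated text to stdout; extra filtering of timing/log lines may be needed.

#!/bin/bash
# Sketch: compare dynamic-graph vs. fused inference-model outputs for qwen-7b.
# Under greedy_search, matching outputs suggest the two paths are precision-aligned.
set -e

# Dynamic-graph run
python3 predictor.py \
--model_name_or_path qwen/qwen-7b \
--decode_strategy greedy_search \
--batch_size 1 \
--dtype float16 | tee dygraph_out.txt

# Fused inference-model run
python3 predictor.py \
--model_name_or_path qwen/qwen-7b \
--decode_strategy greedy_search \
--batch_size 1 \
--inference_model \
--dtype float16 | tee inference_model_out.txt

# Any diff here points to a mismatch between the two paths
diff dygraph_out.txt inference_model_out.txt && echo "outputs match"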
LGTM
LGTM
PR types
PR changes
Description