fix pad_token_id bug #8814
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
    @@            Coverage Diff             @@
    ##           develop    #8814     +/-   ##
    ===========================================
    - Coverage    55.52%   55.51%   -0.02%
    ===========================================
      Files          630      631       +1
      Lines        98365    99128     +763
    ===========================================
    + Hits         54619    55032     +413
    - Misses       43746    44096     +350
@@ -1270,7 +1272,7 @@ def create_predictor(

     # TODO(wj-Mcat): fix llama tokenzier pad_token bug
     if (isinstance(tokenizer, (LlamaTokenizer, Llama3Tokenizer))) and not tokenizer.pad_token:
-        tokenizer.pad_token = tokenizer.bos_token
+        tokenizer.pad_token = tokenizer.eos_token
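For reviewers who want to see the effect of this hunk in isolation, here is a minimal sketch of the fallback. `StubTokenizer` and `ensure_pad_token` are hypothetical names made up for illustration; they are not PaddleNLP classes or functions, and the token strings are assumed defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubTokenizer:
    """Hypothetical stand-in for a Llama-style tokenizer (not the real class)."""

    bos_token: str = "<s>"
    eos_token: str = "</s>"
    pad_token: Optional[str] = None


def ensure_pad_token(tokenizer: StubTokenizer) -> StubTokenizer:
    """Fall back to eos_token when the tokenizer has no pad_token set."""
    if not tokenizer.pad_token:
        # Mirrors the changed line in the diff: pad with EOS rather than BOS.
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer


tok = ensure_pad_token(StubTokenizer())
print(tok.pad_token)
```

A tokenizer that already defines a pad token is left untouched, which matches the `not tokenizer.pad_token` guard in the hunk.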
    logger.warning(
        "Setting `pad_token_id` to `eos_token_id`:{} for " "open-end generation.".format(eos_token_id)
    )
    if isinstance(eos_token_id, list):
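The body of the `isinstance(eos_token_id, list)` branch is not visible in this excerpt. The sketch below shows one plausible way to resolve a pad id when `eos_token_id` may be an int or a list. Picking the first list element is an assumption, and `resolve_pad_token_id` is a hypothetical helper, not a function from this PR.

```python
from typing import List, Optional, Union


def resolve_pad_token_id(
    pad_token_id: Optional[int],
    eos_token_id: Optional[Union[int, List[int]]],
) -> Optional[int]:
    """Return a usable pad id, falling back to the EOS id when none is set."""
    if pad_token_id is not None:
        # An explicitly configured pad id always wins.
        return pad_token_id
    if eos_token_id is None:
        return None
    if isinstance(eos_token_id, list):
        # Assumption: use the first EOS id when several are configured.
        return eos_token_id[0] if eos_token_id else None
    return eos_token_id


print(resolve_pad_token_id(None, [128001, 128009]))
```

Without the list check, a list-valued `eos_token_id` would be passed through as the pad id and fail later wherever a scalar is expected.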
LGTM
PR types
Bug fixes
PR changes
Models
Description
Fix the pad_token_id bug: when a Llama tokenizer has no pad_token, fall back to eos_token instead of bos_token.