Use our own Linear layer for easier tie_weights; Fix resize_token_embeddings API #5623
Conversation
Thanks for your contribution!
@@ -94,12 +94,14 @@ def __init__(
        layer_norm_epsilon: float = 1e-05,
        initializer_range: float = 0.02,
        n_inner: int = None,
+       tie_word_embeddings: bool = False,
Most models default tie_weights to True, but CodeGen's tie_word_embeddings is False, so it has to be specified explicitly here; otherwise resize_word_embeddings runs into problems.
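For context, a minimal sketch of why the flag matters when resizing. It assumes Paddle as the framework and uses hypothetical attribute names (`model.embeddings`, `model.lm_head`); it is not the implementation added by this PR. With tied embeddings the resized table must stay shared with the output projection; with untied embeddings, as in CodeGen, the output projection has to be resized separately.

```python
import paddle
import paddle.nn as nn

def resize_token_embeddings(model, new_num_tokens, tie_word_embeddings):
    # Sketch only: model.embeddings / model.lm_head are illustrative names.
    old_weight = model.embeddings.weight              # shape [old_vocab, hidden]
    old_vocab, hidden = old_weight.shape

    new_embeddings = nn.Embedding(new_num_tokens, hidden)
    with paddle.no_grad():
        # keep the rows that already exist; extra rows keep their fresh init
        new_embeddings.weight[:old_vocab] = old_weight
    model.embeddings = new_embeddings

    if tie_word_embeddings:
        # Tied case: the LM head must keep pointing at the *same* parameter,
        # which is simple when its weight uses the embedding's [vocab, hidden]
        # layout (what a project-owned Linear layer makes possible).
        model.lm_head.weight = new_embeddings.weight
    else:
        # Untied case (CodeGen): the LM head owns a separate weight of shape
        # [hidden, vocab] (paddle.nn.Linear layout) and must be resized too.
        new_head = nn.Linear(hidden, new_num_tokens)
        with paddle.no_grad():
            new_head.weight[:, :old_vocab] = model.lm_head.weight
        model.lm_head = new_head
    return model
```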
Codecov Report
@@            Coverage Diff             @@
##           develop    #5623      +/-   ##
===========================================
- Coverage    59.47%    59.45%    -0.03%
===========================================
  Files          482       483        +1
  Lines        68105     68187       +82
===========================================
+ Hits         40506     40541       +35
- Misses       27599     27646       +47
... and 10 files with indirect coverage changes
LGTM
PR types
PR changes
Description
Use our own Linear layer so tie_weights can share the embedding weight directly, and fix the resize_token_embeddings API so it respects tie_word_embeddings (which CodeGen sets to False).
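As a rough illustration of the "own Linear layer" idea (a sketch assuming Paddle; the class name and signatures are hypothetical, not the code in this PR): storing the weight in the same [vocab_size, hidden_size] layout as nn.Embedding lets tie_weights share a single parameter object instead of transposing or copying it.

```python
import paddle
import paddle.nn as nn

class TransposedLinear(nn.Layer):
    """Sketch of a project-owned Linear whose weight is stored like an
    embedding table ([vocab_size, hidden_size]), so weight tying is a plain
    parameter assignment."""

    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.weight = self.create_parameter(shape=[vocab_size, hidden_size])
        self.bias = self.create_parameter(shape=[vocab_size], is_bias=True)

    def forward(self, hidden_states):
        # logits = hidden_states @ weight^T + bias
        return paddle.matmul(hidden_states, self.weight, transpose_y=True) + self.bias


# Tying then becomes sharing one parameter object:
vocab_size, hidden_size = 32000, 1024          # illustrative sizes only
embedding = nn.Embedding(vocab_size, hidden_size)
lm_head = TransposedLinear(hidden_size, vocab_size)
lm_head.weight = embedding.weight              # tie_weights in one assignment
```

With this layout, resize_token_embeddings only needs to grow the embedding table and re-point the head at it, rather than rebuilding a transposed copy.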