Use our own Linear layer for easier tie_weights; Fix resize_token_embeddings API #5623

Merged: 13 commits merged on Apr 13, 2023

Conversation

@sijunhe (Collaborator) commented Apr 12, 2023

PR types

PR changes

Description

@sijunhe sijunhe requested review from gongel and JunnYu April 12, 2023 05:29
paddle-bot (bot) commented Apr 12, 2023

Thanks for your contribution!

@sijunhe sijunhe changed the title from "linear linear" to "Use our own Linear layer for easier tie_weights" Apr 12, 2023
@sijunhe sijunhe changed the title from "Use our own Linear layer for easier tie_weights" to "Use our own Linear layer for easier tie_weights; Fix resize_token_embeddings API" Apr 12, 2023
paddlenlp/layers/linear.py (outdated, resolved)
@@ -94,12 +94,14 @@ def __init__(
layer_norm_epsilon: float = 1e-05,
initializer_range: float = 0.02,
n_inner: int = None,
tie_word_embeddings: bool = False,
Collaborator Author

Most models default to tie_weights being True, but CodeGen's tie_word_embeddings is False, so it is set explicitly here; otherwise resize_token_embeddings would break.
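
To illustrate the point in the comment above, here is a minimal toy sketch of why a resize routine needs to know whether the output head is tied to the input embeddings. It is not the PaddleNLP implementation: `ToyConfig`, `ToyLM`, and the helper functions are hypothetical names, and plain Python lists stand in for parameter tensors.

```python
# Toy illustration only: plain Python lists stand in for parameter tensors.
# ToyConfig, ToyLM and their methods are hypothetical names, not PaddleNLP APIs.
from dataclasses import dataclass


@dataclass
class ToyConfig:
    vocab_size: int
    hidden_size: int
    tie_word_embeddings: bool = True  # assumed default, following the comment above


def _new_matrix(rows, cols, fill):
    return [[fill] * cols for _ in range(rows)]


def _resized(old, new_rows, fill=0.0):
    """Copy the overlapping rows of `old` into a matrix with `new_rows` rows."""
    cols = len(old[0])
    new = _new_matrix(new_rows, cols, fill)
    for i in range(min(new_rows, len(old))):
        new[i] = list(old[i])
    return new


class ToyLM:
    def __init__(self, config):
        self.config = config
        self.embeddings = _new_matrix(config.vocab_size, config.hidden_size, 1.0)
        # An untied head has its own values (2.0 here so it is easy to tell apart).
        self.lm_head = _new_matrix(config.vocab_size, config.hidden_size, 2.0)
        if config.tie_word_embeddings:
            self.tie_weights()

    def tie_weights(self):
        # Tying means both names refer to the same parameter object.
        self.lm_head = self.embeddings

    def resize_token_embeddings(self, new_vocab_size):
        self.embeddings = _resized(self.embeddings, new_vocab_size)
        if self.config.tie_word_embeddings:
            # The old tie pointed at the old matrix, so re-tie after resizing.
            self.tie_weights()
        else:
            # An untied head keeps its own values and is resized separately.
            self.lm_head = _resized(self.lm_head, new_vocab_size)
        self.config.vocab_size = new_vocab_size


tied = ToyLM(ToyConfig(vocab_size=4, hidden_size=2))
tied.resize_token_embeddings(6)

untied = ToyLM(ToyConfig(vocab_size=4, hidden_size=2, tie_word_embeddings=False))
untied.resize_token_embeddings(6)
```

In this sketch, a config that wrongly reported `tie_word_embeddings=True` for an untied model would cause the resize step to replace the head with the embedding matrix, which is one plausible reading of the problem the comment describes.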

codecov (bot) commented Apr 13, 2023

Codecov Report

Merging #5623 (cacf36c) into develop (8bea1a9) will decrease coverage by 0.03%.
The diff coverage is 92.75%.

@@             Coverage Diff             @@
##           develop    #5623      +/-   ##
===========================================
- Coverage    59.47%   59.45%   -0.03%     
===========================================
  Files          482      483       +1     
  Lines        68105    68187      +82     
===========================================
+ Hits         40506    40541      +35     
- Misses       27599    27646      +47     
| Impacted Files | Coverage Δ |
|---|---|
| paddlenlp/transformers/codegen/configuration.py | 100.00% <ø> (ø) |
| paddlenlp/transformers/llama/configuration.py | 100.00% <ø> (ø) |
| paddlenlp/transformers/gpt/modeling.py | 77.82% <71.42%> (-0.25%) ⬇️ |
| paddlenlp/transformers/model_utils.py | 51.10% <90.00%> (-0.53%) ⬇️ |
| paddlenlp/layers/__init__.py | 100.00% <100.00%> (ø) |
| paddlenlp/layers/linear.py | 100.00% <100.00%> (ø) |
| paddlenlp/prompt/verbalizer.py | 89.93% <100.00%> (-0.21%) ⬇️ |
| paddlenlp/transformers/albert/modeling.py | 85.57% <100.00%> (+0.03%) ⬆️ |
| paddlenlp/transformers/artist/modeling.py | 90.90% <100.00%> (ø) |
| paddlenlp/transformers/bert/modeling.py | 89.91% <100.00%> (+0.02%) ⬆️ |
... and 3 more

... and 10 files with indirect coverage changes

Member

@JunnYu JunnYu left a comment

LGTM

@gongel gongel merged commit aef101a into PaddlePaddle:develop Apr 13, 2023
@sijunhe sijunhe deleted the linear branch April 13, 2023 11:53