[MegatronBERT P0] add PretrainedConfig and unit test #4912
Conversation
Thanks for your contribution!

Hello, and welcome to the PaddlePaddle Hackathon. This PR is a good start; as a next step, please enable the test_modeling and test_tokenizer unit tests.
Good progress. Once your environment is set up, you can run the tests locally with `pytest tests/transformers/megatronbert/test_modeling.py` and `pytest tests/transformers/megatronbert/test_tokenizer.py`.
Hello, I found a bug in tokenizer.py: when megatronbert_tokenizer inherits from bert_tokenizer, some arguments are not passed through. This breaks the tokenizer's argument forwarding and causes test_tokenizer to fail.
```diff
 self.self = MegatronBertSelfAttention(
-    num_attention_heads=num_attention_heads,
-    attention_probs_dropout_prob=attention_probs_dropout_prob,
-    max_position_embeddings=max_position_embeddings,
-    position_embedding_type=position_embedding_type,
+    num_attention_heads=config.num_attention_heads,
+    attention_probs_dropout_prob=config.attention_probs_dropout_prob,
+    max_position_embeddings=config.max_position_embeddings,
+    position_embedding_type=config.position_embedding_type,
 )
```
This should be `self.self = MegatronBertSelfAttention(config)`.
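To make the suggested pattern concrete, here is a minimal sketch of the attention module's constructor once every sub-module takes only the config object. The exact class layout is assumed from the diffs in this review; `MegatronBertConfig` and the sub-modules are taken to be defined in the same file:

```python
import paddle.nn as nn


class MegatronBertAttention(nn.Layer):
    """Attention block built entirely from a MegatronBertConfig."""

    def __init__(self, config: MegatronBertConfig):
        super().__init__()
        # LayerNorm hyperparameters come straight from the config.
        self.layer_norm = nn.LayerNorm(config.hidden_size, epsilon=config.layer_norm_eps)
        # Sub-modules receive the whole config instead of unpacked kwargs.
        self.self = MegatronBertSelfAttention(config)
        self.output = MegatronBertSelfOutput(config)
```

This is the convention the PretrainedConfig refactor pushes toward: constructors read hyperparameters from one config object instead of long keyword lists.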
```diff
 super(MegatronBertAttention, self).__init__()
-self.layer_norm = nn.LayerNorm(hidden_size, epsilon=layer_norm_eps)
+self.layer_norm = nn.LayerNorm(config.hidden_size, epsilon=layer_norm_eps)
```
This should be `config.layer_norm_eps`.
```
max_position_embeddings=config.max_position_embeddings,
position_embedding_type=config.position_embedding_type,
)
self.output = MegatronBertSelfOutput(
```
This should be `self.output = MegatronBertSelfOutput(config)`.
```diff
 self.attention = MegatronBertAttention(
-    hidden_size=hidden_size,
-    num_attention_heads=num_attention_heads,
-    hidden_dropout_prob=hidden_dropout_prob,
-    attention_probs_dropout_prob=attention_probs_dropout_prob,
-    max_position_embeddings=max_position_embeddings,
-    position_embedding_type=position_embedding_type,
+    hidden_size=config.hidden_size,
+    num_attention_heads=config.num_attention_heads,
+    hidden_dropout_prob=config.hidden_dropout_prob,
+    attention_probs_dropout_prob=config.attention_probs_dropout_prob,
+    max_position_embeddings=config.max_position_embeddings,
+    position_embedding_type=config.position_embedding_type,
```
This should be `self.attention = MegatronBertAttention(config)`.
```diff
 )

-self.layer_norm = nn.LayerNorm(hidden_size, epsilon=layer_norm_eps)
+self.layer_norm = nn.LayerNorm(config.hidden_size, epsilon=layer_norm_eps)
```
This should be `config.layer_norm_eps`.
```diff
 self.intermediate = MegatronBertIntermediate(
-    hidden_size=hidden_size, intermediate_size=intermediate_size, hidden_act=hidden_act
+    hidden_size=config.hidden_size, intermediate_size=config.intermediate_size, hidden_act=config.hidden_act
 )
 self.output = MegatronBertOutput(
-    intermediate_size, hidden_dropout_prob=hidden_dropout_prob, hidden_size=hidden_size
+    config.intermediate_size, hidden_dropout_prob=config.hidden_dropout_prob, hidden_size=config.hidden_size
 )
```
All of these modules should take only `config` as their input argument.
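Putting the comments on this file together, a hedged sketch of the whole layer under the config-only convention (attribute names assumed from the diffs above):

```python
import paddle.nn as nn


class MegatronBertLayer(nn.Layer):
    """Transformer layer whose sub-modules all take only the config."""

    def __init__(self, config: MegatronBertConfig):
        super().__init__()
        self.attention = MegatronBertAttention(config)
        self.layer_norm = nn.LayerNorm(config.hidden_size, epsilon=config.layer_norm_eps)
        self.intermediate = MegatronBertIntermediate(config)
        self.output = MegatronBertOutput(config)
```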
```diff
-    position_embedding_type=None,
-    num_hidden_layers=24,
-):
+def __init__(self, config: MegatronBertConfig):
     super(MegatronBertEncoder, self).__init__()
     self.layer = nn.LayerList(
         [
             MegatronBertLayer(
```
The input argument should be `config`.
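A minimal sketch of the encoder constructor after this change, assuming `num_hidden_layers` is read from the config as the diff above implies:

```python
import paddle.nn as nn


class MegatronBertEncoder(nn.Layer):
    """Stack of MegatronBertLayer blocks sized by config.num_hidden_layers."""

    def __init__(self, config: MegatronBertConfig):
        super(MegatronBertEncoder, self).__init__()
        # Every layer is constructed from the same config object.
        self.layer = nn.LayerList(
            [MegatronBertLayer(config) for _ in range(config.num_hidden_layers)]
        )
```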
```diff
-self.initializer_range = initializer_range
+self.num_hidden_layers = config.num_hidden_layers
+self.pad_token_id = config.pad_token_id
+self.initializer_range = config.initializer_range
 self.embeddings = MegatronBertEmbeddings(
```
The input argument should be `config`.
```
    type_vocab_size=config.type_vocab_size,
    max_position_embeddings=config.max_position_embeddings,
    hidden_dropout_prob=config.hidden_dropout_prob,
    position_embedding_type=config.position_embedding_type,
)
self.encoder = MegatronBertEncoder(
```
The input argument should be `config`.
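For completeness, a sketch of how the model constructor reads once both sub-modules take the config. The base-class name is an assumption; only attributes visible in the diffs above are shown:

```python
class MegatronBertModel(MegatronBertPretrainedModel):  # base class assumed
    """Bare MegatronBERT model: embeddings plus encoder, both config-driven."""

    def __init__(self, config: MegatronBertConfig):
        super(MegatronBertModel, self).__init__(config)
        self.num_hidden_layers = config.num_hidden_layers
        self.pad_token_id = config.pad_token_id
        self.initializer_range = config.initializer_range
        # Both sub-modules take the config object directly.
        self.embeddings = MegatronBertEmbeddings(config)
        self.encoder = MegatronBertEncoder(config)
```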
No problem. This model's code hasn't been maintained in a while, so bugs are expected; just fix them in this PR along with the rest.

Hello, with this done, is the model now complete?
Codecov Report
```
@@            Coverage Diff             @@
##           develop    #4912      +/-   ##
===========================================
+ Coverage    46.32%   47.56%   +1.23%
===========================================
  Files          448      453       +5
  Lines        64616    65421     +805
===========================================
+ Hits         29936    31116    +1180
+ Misses       34680    34305     -375
```
Nice work! Just address the small issues in my comments and this can be merged.
```diff
 self.decoder_weight = self.create_parameter(
-    shape=[vocab_size, hidden_size], dtype=self.transform.weight.dtype, is_bias=False
+    shape=[config.vocab_size, config.hidden_size], dtype=nn.Embedding(1, 1)._dtype, is_bias=False
```
Here the dtype should remain `dtype=self.transform.weight.dtype`.
Hello, the reason I changed this part is that it raised `'MegatronBertPredictionHeadTransform' object has no attribute 'weight'`: in the original modeling file, MegatronBertPredictionHeadTransform has no weight attribute.
Understood. In that case, try `dtype=self.transform.dense.weight.dtype` instead.
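A hedged sketch of the head after that fix; the class name and surrounding layout are assumptions based on this thread, not taken from the PR itself:

```python
import paddle.nn as nn


class MegatronBertLMPredictionHead(nn.Layer):  # class name assumed
    """LM head whose decoder weight dtype follows the transform's dense layer."""

    def __init__(self, config: MegatronBertConfig):
        super().__init__()
        self.transform = MegatronBertPredictionHeadTransform(config)
        # The transform has no `weight` attribute itself, but its inner
        # dense layer does, so read the dtype from there.
        self.decoder_weight = self.create_parameter(
            shape=[config.vocab_size, config.hidden_size],
            dtype=self.transform.dense.weight.dtype,
            is_bias=False,
        )
```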
```python
@parameterized_class(
    ("return_dict", "use_labels"),
    [
        [False, False],
        [False, True],
        [True, False],
        [True, True],
    ],
)
```
The current model does not implement `return_dict` or `labels`, so this `parameterized_class` block is unnecessary and can be deleted.
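In other words (the test-class name below is illustrative, not from this PR), the file would keep just the undecorated test class:

```python
import unittest


class MegatronBertModelTest(unittest.TestCase):  # name assumed
    # No @parameterized_class decorator: this model does not implement
    # return_dict or labels, so there is nothing to parameterize over.
    ...
```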
"megatronbert-cased": "http://bj.bcebos.com/paddlenlp/models/transformers/" | ||
"megatron-bert/megatronbert-cased/model_state.pdparams", | ||
"megatronbert-uncased": "http://bj.bcebos.com/paddlenlp/models/transformers/" | ||
"megatron-bert/megatronbert-cased/model_state.pdparams", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"megatronbert-cased": "http://bj.bcebos.com/paddlenlp/models/transformers/" | |
"megatron-bert/megatronbert-cased/model_state.pdparams", | |
"megatronbert-uncased": "http://bj.bcebos.com/paddlenlp/models/transformers/" | |
"megatron-bert/megatronbert-cased/model_state.pdparams", | |
"megatronbert-cased": "http://bj.bcebos.com/paddlenlp/models/transformers/megatron-bert/megatronbert-cased/model_state.pdparams", | |
"megatronbert-uncased": "http://bj.bcebos.com/paddlenlp/models/transformers/megatron-bert/megatronbert-uncased/model_state.pdparams", |
Very close now. Just address the small issues in my comments and this can be merged.
LGTM. Thank you for completing this model upgrade!