
Fix the bug when using 0-D tensor in MoE model #5538

Merged
1 commit merged into PaddlePaddle:develop on May 9, 2023

Conversation

@pkuzyc (Contributor) commented Apr 5, 2023

PR types

Others

PR changes

Others

Description

Fix the bug when using 0-D tensor in MoE model

@paddle-bot bot commented Apr 5, 2023

Thanks for your contribution!

@pkuzyc pkuzyc changed the title fix the bug in lr_scheduler init and fix the diff of GPT model in aut… [AutoParallel] fix the diff in lr_scheduler and GPT model Apr 5, 2023
@codecov bot commented Apr 6, 2023

Codecov Report

Merging #5538 (aa4cedc) into develop (b7246e1) will decrease coverage by 2.52%.
The diff coverage is n/a.

❗ Current head aa4cedc differs from pull request most recent head 67389b6. Consider uploading reports for the commit 67389b6 to get more accurate results

@@             Coverage Diff             @@
##           develop    #5538      +/-   ##
===========================================
- Coverage    61.94%   59.43%   -2.52%     
===========================================
  Files          491      482       -9     
  Lines        69118    68103    -1015     
===========================================
- Hits         42817    40475    -2342     
- Misses       26301    27628    +1327     

see 123 files with indirect coverage changes

fuse_attn_qkv: False
fused_linear: False
fuse_attn_qkv: True
scale_qk_by_layer_num: True
Collaborator:

Confirm which of these options are actually configurable in auto parallel, then decide whether to add them.

Contributor Author:

sequence_parallel is not used, so it has been removed; the corresponding parameter in the model-building code has also been removed.

@@ -2,8 +2,8 @@ _base_: ./pretrain_gpt_base.yaml

Global:
global_batch_size:
local_batch_size: 8
micro_batch_size: 8
local_batch_size: 4
Collaborator:

Whenever possible, configure this in the bash script via -o Global.local_batch_size instead of modifying the yaml.

Contributor Author:

The yaml has been reverted; the run scripts under projects/gpt were modified to change the configuration instead.
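For illustration only, a toy sketch of how a dotted override such as -o Global.local_batch_size=4 can be layered on top of the yaml defaults at launch time; the apply_override helper below is hypothetical and is not the actual PaddleFleetX option parser:

```python
def apply_override(config, override):
    """Apply a dotted 'Section.key=value' override to a nested config dict."""
    path, value = override.split("=", 1)
    keys = path.split(".")
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    # Keep integers numeric so batch sizes stay usable as ints.
    node[keys[-1]] = int(value) if value.isdigit() else value
    return config


# Defaults as they appear in the yaml above.
config = {"Global": {"global_batch_size": None,
                     "local_batch_size": 8,
                     "micro_batch_size": 8}}
apply_override(config, "Global.local_batch_size=4")
print(config["Global"]["local_batch_size"])  # 4
```

Applying the override at launch time keeps the checked-in yaml defaults untouched, which is what the review above asks for.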

@@ -2,8 +2,8 @@ _base_: ./pretrain_gpt_base.yaml

Global:
global_batch_size:
local_batch_size: 8
micro_batch_size: 8
local_batch_size: 4
Collaborator:

Same as above.

max_seq_len: 1024

sampler:
Collaborator:

Auto parallel does not need the sampler config, nor the loader config.

Contributor Author:

Removed sampler and loader.

Eval:
collate_fn: gpt_collate_fn
sample_split: 2
Collaborator:

The two lines above must not be deleted.

Contributor Author:

Added them back.

if self.use_recompute and self.recompute_granularity == "core_attn":
out, weights = auto.recompute(self.core_attn)(q, k, v, attn_mask=attn_mask)
if self.use_recompute and self.recompute_granularity == "core_attn" and self.do_recompute:
out, weights = recompute(self.core_attn, q, k, v, attn_mask)
Collaborator:

The auto parallel recompute interface is different from the dynamic-graph one. This must not be changed here.

Contributor Author:

Changed it to the auto parallel interface.
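For reference, a minimal sketch contrasting the two call styles in the diff above, assuming a Paddle 2.4-era install with a GPU; core_attn and the random tensors are toy stand-ins, and auto.recompute only takes real effect when the model runs through the auto parallel engine:

```python
import paddle
import paddle.nn.functional as F
from paddle.distributed.fleet import auto             # auto parallel interface
from paddle.distributed.fleet.utils import recompute  # dynamic-graph interface


def core_attn(q, k, v, attn_mask=None):
    # Toy attention: scores -> softmax -> weighted sum of values.
    weights = F.softmax(paddle.matmul(q, k, transpose_y=True))
    return paddle.matmul(weights, v), weights


q = k = v = paddle.randn([2, 4, 8])
q.stop_gradient = False

# Auto parallel style (kept in this PR): wrap the callable first, then call it
# with the original arguments so the enclosed ops are marked for recompute.
out, weights = auto.recompute(core_attn)(q, k, v, attn_mask=None)

# Dynamic-graph style (reverted here): pass the callable and its arguments to
# recompute in a single call.
out, weights = recompute(core_attn, q, k, v, None)
```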

@@ -1004,7 +1137,6 @@ def _post_process_(outputs, input_ids, cur_len, origin_len, scores, unfinished_f
# make the shape of attention_mask = (-1, -1, -1, -1) in dy2static.
model_kwargs["attention_mask"] = paddle.reshape(attn_mask, paddle.shape(attn_mask))
model_kwargs["cache"] = outputs[1] if isinstance(outputs, tuple) else None
max_length = paddle.to_tensor(max_length)
Collaborator:

This line must not be deleted; dynamic-to-static conversion needs it.

Contributor Author:

Added it back.

# early finish should be True in generation scenes,
# If users want to test the inference speed, you can just set it False.
if self.early_finish and not paddle.any(unfinished_flag):
if not paddle.any(unfinished_flag):
break
Collaborator:

This must not be changed here either.

Contributor Author:

Reverted it.
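A condensed sketch of the generation-loop pattern restored in the two threads above: max_length is wrapped in a tensor for dynamic-to-static conversion, and the loop breaks early once every sequence has finished. The decoding state below is a toy stand-in for the real model outputs:

```python
import paddle

# Toy decoding state: 4 sequences, none finished yet.
unfinished_flag = paddle.ones([4, 1], dtype="bool")
cur_len = paddle.to_tensor(0, dtype="int64")
early_finish = True

max_length = 8
# Keep the bound as a tensor so the loop condition stays a tensor op after
# dynamic-to-static conversion (the line restored above).
max_length = paddle.to_tensor(max_length, dtype="int64")

while cur_len < max_length:
    cur_len += 1
    # Pretend every sequence emits its end token on this step.
    unfinished_flag = paddle.zeros_like(unfinished_flag)

    # Early finish should be True in generation scenes; set it to False only
    # when benchmarking inference speed (the condition restored above).
    if early_finish and not paddle.any(unfinished_flag):
        break

print(int(cur_len))  # 1: the loop exits as soon as all sequences finish
```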

loss_mask = loss_mask.reshape([-1])
masked_lm_loss = paddle.sum(masked_lm_loss.reshape([-1]) * loss_mask)
loss = masked_lm_loss / loss_mask.sum()
return loss


class GPTForSequenceClassification(nn.Layer):
Collaborator:

Auto parallel does not support this task yet; it can be deleted.

Contributor Author:

Deleted.

@@ -620,7 +723,7 @@ class GPTPretrainingCriterionAuto(nn.Layer):
Criterion for GPT. It calculates the final loss.
"""

def __init__(self, mesh):
def __init__(self, mesh, topo=None):
Collaborator:

topo is no longer needed.

Contributor Author:

Deleted.

@pkuzyc pkuzyc force-pushed the develop branch 2 times, most recently from 1c79fd7 to 2ed0e71 on April 7, 2023 09:52
@pkuzyc pkuzyc changed the title [AutoParallel] fix the diff in lr_scheduler and GPT model Fix the bug when using 0-D tensor in MoE model May 9, 2023
@ZHUI (Collaborator) left a comment

LGTM

@ZHUI ZHUI merged commit 80cc859 into PaddlePaddle:develop May 9, 2023