This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[Numpy] Refactor Roberta #1269

Merged
merged 11 commits into from
Jul 21, 2020

Conversation

zheyuye
Member

@zheyuye zheyuye commented Jul 17, 2020

Description

Refactor the roberta model following apache/mxnet#18717

Changes

  • Refactor roberta, keeping the model structure and external APIs consistent with the other pretrained models
  • Keep the tokenizer interface consistent
  • Store the converted weights separately for the backbone and the masked language model
  • Revise run_squad.py to support roberta fine-tuning

Comments

@sxjscience @hymzoque

@codecov

codecov bot commented Jul 17, 2020

Codecov Report

Merging #1269 into numpy will decrease coverage by 0.11%.
The diff coverage is 86.08%.


@@            Coverage Diff             @@
##            numpy    #1269      +/-   ##
==========================================
- Coverage   82.53%   82.42%   -0.12%     
==========================================
  Files          38       38              
  Lines        5446     5491      +45     
==========================================
+ Hits         4495     4526      +31     
- Misses        951      965      +14     
Impacted Files                       Coverage Δ
src/gluonnlp/models/mobilebert.py    81.35% <ø> (ø)
src/gluonnlp/models/roberta.py       88.78% <84.09%> (-4.87%) ⬇️
src/gluonnlp/models/xlmr.py          86.88% <88.23%> (-1.12%) ⬇️
src/gluonnlp/data/tokenizers.py      77.71% <100.00%> (ø)
src/gluonnlp/models/albert.py        96.68% <100.00%> (+0.01%) ⬆️
src/gluonnlp/models/bert.py          84.42% <100.00%> (ø)
src/gluonnlp/models/electra.py       78.94% <100.00%> (+0.07%) ⬆️

@leezu
Contributor

leezu commented Jul 17, 2020

It's unclear which parts of the PR are related to apache/mxnet#18717 and which are unrelated changes. I suggest focusing on making the apache/mxnet#18717 feature available in MXNet instead of making the GluonNLP code more complex to work around the missing feature. What do you think?

Is there any overlap between the weights mentioned in "Store the converted weights separately for backbone and masked language model"?

@szha szha requested a review from leezu July 17, 2020 19:25
Contributor

@leezu leezu left a comment


As you cite apache/mxnet#18717 could you please check if apache/mxnet#18749 addresses the feature request and if this PR should be adapted? Thanks

@zheyuye
Member Author

zheyuye commented Jul 18, 2020

@leezu There is a huge overlap between 'model.params' and 'model-mlm.params', which apache/mxnet#18749 would eliminate: we could store only 'model-mlm.params' and load it with ignore_extra=True for the backbone model. This could be done in a separate PR for roberta, as well as for the other pretrained models stored as separate parameter files, if this solution sounds reasonable to you.
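The storage overlap described above can be sketched with plain Python dicts standing in for MXNet parameter files (the key names are hypothetical; the real mechanism would be Gluon's load_parameters(..., ignore_extra=True)):

```python
# Sketch of the storage-overlap idea: the MLM checkpoint is a superset of the
# backbone checkpoint, so storing only 'model-mlm.params' and loading the
# backbone with ignore_extra=True-style filtering avoids duplicating weights.
backbone_keys = {"embedding.weight", "encoder.layer0.weight"}

mlm_params = {
    "embedding.weight": [0.3, 0.4],
    "encoder.layer0.weight": [0.1, 0.2],
    "mlm_decoder.weight": [0.5, 0.6],  # MLM-only head, absent from backbone
}

# Equivalent of ignore_extra=True: keep only the keys the backbone declares,
# silently dropping the MLM-only head weights.
backbone_params = {k: v for k, v in mlm_params.items() if k in backbone_keys}
print(sorted(backbone_params))  # ['embedding.weight', 'encoder.layer0.weight']
```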

In addition, this PR fixes the problem that the current roberta model does not handle the MLM task properly. That is, the MLM model takes the same input (input_ids, valid_length) as the backbone, without marking which positions are masked, as is done in

mlm_features = select_vectors_by_position(F, contextual_embeddings, masked_positions)

and in the official implementation.
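The masked-position gathering that select_vectors_by_position performs can be sketched in plain NumPy (the shapes and toy values here are assumed for illustration, not the actual GluonNLP signature):

```python
import numpy as np

# Gather the contextual embeddings at the masked positions so the MLM head
# only scores the tokens that were actually masked.
batch_size, seq_len, units = 2, 5, 4
contextual_embeddings = np.arange(
    batch_size * seq_len * units, dtype=np.float32
).reshape(batch_size, seq_len, units)
masked_positions = np.array([[1, 3], [0, 2]])  # (batch_size, num_masked)

batch_idx = np.arange(batch_size)[:, None]     # (batch_size, 1), broadcasts
mlm_features = contextual_embeddings[batch_idx, masked_positions]
print(mlm_features.shape)  # (2, 2, 4)
```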

Contributor

@leezu leezu left a comment


Blocked by apache/mxnet#18749

@leezu
Contributor

leezu commented Jul 20, 2020

Thanks @zheyuye. Yes, I recommend adding the feature on the MXNet side instead of adopting a problematic workaround in GluonNLP

@zheyuye
Member Author

zheyuye commented Jul 20, 2020

> Thanks @zheyuye. Yes, I recommend adding the feature on the MXNet side instead of adopting a problematic workaround in GluonNLP

Alright, I am going to combine the backbone and its MLM model together in this PR once apache/mxnet#18749 is merged

@sxjscience
Member

@leezu Why is it blocked by apache/mxnet#18749?

Contributor

@leezu leezu left a comment


@sxjscience it adds a workaround for a missing feature in MXNet. We should improve the MXNet serialization format instead of adding workarounds. In any case, let's add the workaround now and remove it again later.

@sxjscience sxjscience merged commit 3a0ed9f into dmlc:numpy Jul 21, 2020
@zheyuye zheyuye deleted the roberta branch July 21, 2020 07:14
Labels: bug, numpyrefactor
3 participants