
why squad.py did not reproduce squad1.1 report result? #4301

Closed
3 tasks done
yyHaker opened this issue May 12, 2020 · 4 comments

Comments

@yyHaker

yyHaker commented May 12, 2020

📚 Migration

Information

Model I am using (Bert, XLNet ...):

Language I am using the model on (English...):

The problem arises when using:

  • the official example scripts: (give details below)
    examples/question-answering/run_squad.py
  • my own modified scripts: (give details below)
    ```
    CUDA_VISIBLE_DEVICES=5 python examples/question-answering/run_squad.py \
        --model_type bert \
        --model_name_or_path bert-large-uncased-whole-word-masking \
        --do_train \
        --do_eval \
        --data_dir EKMRC/data/squad1.1 \
        --train_file train-v1.1.json \
        --predict_file dev-v1.1.json \
        --per_gpu_eval_batch_size=4 \
        --per_gpu_train_batch_size=4 \
        --gradient_accumulation_steps=6 \
        --save_steps 3682 \
        --learning_rate 3e-5 \
        --num_train_epochs 2 \
        --max_seq_length 384 \
        --doc_stride 128 \
        --output_dir result/debug_squad/wwm_uncased_bert_large_finetuned_squad/ \
        --overwrite_output_dir
    ```

The tasks I am working on is:

  • an official GLUE/SQUaD task: (give the name)

Details

I did not reproduce the reported result. The repository says you should get the result below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 86.91579943235573, "f1": 93.1532499015869}

My result is below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 81.03, "f1": 88.02}

Environment info

  • transformers version:
  • Platform: Linux gpu19 3.10.0-1062.4.1.el7.x86_64 #1 SMP Fri Oct 18 17:15:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Python version: python3.6
  • PyTorch version (GPU?): 1.4.0
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: parallel
  • pytorch-transformers or pytorch-pretrained-bert version (or branch):
    current version of transformers.


@MagicFrogSJTU

I have just solved this problem.
You have to set an additional flag: --do_lower_case.
I wonder why run_squad.py behaves differently from run_glue.py, etc. Is there a code improvement on the way?
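
To illustrate the effect of the missing flag (this is a toy sketch, not code from run_squad.py): an uncased checkpoint's vocabulary contains only lowercase tokens, so if the input text is not lowered first, cased words fall back to `[UNK]` and EM/F1 drop, which matches the gap reported above.

```python
# Toy lowercase-only vocabulary, standing in for the vocab of an
# uncased checkpoint such as bert-large-uncased-whole-word-masking.
VOCAB = {"the", "capital", "of", "france", "is", "paris"}

def tokenize(text, do_lower_case):
    """Whitespace-split and map out-of-vocabulary tokens to [UNK]."""
    if do_lower_case:
        text = text.lower()
    return [tok if tok in VOCAB else "[UNK]" for tok in text.split()]

question = "The capital of France is Paris"
# Without lowering, every cased word misses the lowercase vocab:
print(tokenize(question, do_lower_case=False))
# ['[UNK]', 'capital', 'of', '[UNK]', 'is', '[UNK]']
print(tokenize(question, do_lower_case=True))
# ['the', 'capital', 'of', 'france', 'is', 'paris']
```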

@LysandreJik
Member

You shouldn't have to set --do_lower_case as it should be lowercased by default for that model.

@MagicFrogSJTU

You shouldn't have to set --do_lower_case as it should be lowercased by default for that model.

I thought it was, and it should be, but it isn't.

@julien-c
Member

Closing this b/c #4245 was merged

(we still need to investigate why the lowercasing is not properly populated by the model's config)
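
A minimal sketch of what "properly populated" could mean here, assuming the checkpoint ships a `tokenizer_config.json` carrying a `do_lower_case` key (the helper name and the default are my own, not from transformers):

```python
import json

def resolve_do_lower_case(config_text: str) -> bool:
    """Read the lowercasing setting from a tokenizer config string.

    Falling back to False when the key is absent mirrors the behavior
    observed in this thread when the CLI flag was omitted.
    """
    config = json.loads(config_text)
    return bool(config.get("do_lower_case", False))

# An uncased checkpoint's config would be expected to set the key:
print(resolve_do_lower_case('{"do_lower_case": true}'))  # True
# A config missing the key silently disables lowercasing:
print(resolve_do_lower_case('{}'))                       # False
```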
