
why squad.py did not reproduce squad1.1 report result? #4301

Closed
3 tasks done
yyHaker opened this issue May 12, 2020 · 4 comments

Comments

@yyHaker

yyHaker commented May 12, 2020

📚 Migration

Information

Model I am using (Bert, XLNet ...):

Language I am using the model on (English...):

The problem arises when using:

  • the official example scripts: (give details below)
    examples/question-answering/run_squad.py
  • my own modified scripts: (give details below)
    ```
    CUDA_VISIBLE_DEVICES=5 python examples/question-answering/run_squad.py \
        --model_type bert \
        --model_name_or_path bert-large-uncased-whole-word-masking \
        --do_train \
        --do_eval \
        --data_dir EKMRC/data/squad1.1 \
        --train_file train-v1.1.json \
        --predict_file dev-v1.1.json \
        --per_gpu_eval_batch_size=4 \
        --per_gpu_train_batch_size=4 \
        --gradient_accumulation_steps=6 \
        --save_steps 3682 \
        --learning_rate 3e-5 \
        --num_train_epochs 2 \
        --max_seq_length 384 \
        --doc_stride 128 \
        --output_dir result/debug_squad/wwm_uncased_bert_large_finetuned_squad/ \
        --overwrite_output_dir
    ```

The tasks I am working on is:

  • an official GLUE/SQUaD task: (give the name)

Details

I did not reproduce the reported result. The repository says you should get the result below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 86.91579943235573, "f1": 93.1532499015869}

My result is below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 81.03, "f1": 88.02}

Environment info

  • transformers version:
  • Platform: Linux gpu19 3.10.0-1062.4.1.el7.x86_64 #1 SMP Fri Oct 18 17:15:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Python version: python3.6
  • PyTorch version (GPU?): 1.4.0
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: parallel
  • pytorch-transformers or pytorch-pretrained-bert version (or branch):
    current version of transformers.


@MagicFrogSJTU

I have just solved this problem.
You have to set an additional flag: --do_lower_case.
I wonder why run_squad.py behaves differently from run_glue.py, etc. Is there a code improvement on the way?
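
To illustrate the effect of the missing flag (this is a toy sketch, not code from run_squad.py): an uncased checkpoint's vocabulary contains only lowercase tokens, so if the input text is not lowered first, cased words fall back to `[UNK]` and EM/F1 drop, which matches the gap reported above.

```python
# Toy lowercase-only vocabulary, standing in for the vocab of an
# uncased checkpoint such as bert-large-uncased-whole-word-masking.
VOCAB = {"the", "capital", "of", "france", "is", "paris"}

def tokenize(text, do_lower_case):
    """Whitespace-split and map out-of-vocabulary tokens to [UNK]."""
    if do_lower_case:
        text = text.lower()
    return [tok if tok in VOCAB else "[UNK]" for tok in text.split()]

question = "The capital of France is Paris"
# Without lowering, every cased word misses the lowercase vocab:
print(tokenize(question, do_lower_case=False))
# ['[UNK]', 'capital', 'of', '[UNK]', 'is', '[UNK]']
print(tokenize(question, do_lower_case=True))
# ['the', 'capital', 'of', 'france', 'is', 'paris']
```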

@LysandreJik
Member

You shouldn't have to set --do_lower_case as it should be lowercased by default for that model.

@MagicFrogSJTU

You shouldn't have to set --do_lower_case as it should be lowercased by default for that model.

I thought it was, and it should be, but it isn't.

@julien-c
Member

Closing this b/c #4245 was merged

(we still need to investigate why the lowercasing is not properly populated by the model's config)
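
A minimal sketch of what "properly populated" could mean here, assuming the checkpoint ships a `tokenizer_config.json` carrying a `do_lower_case` key (the helper name and the default are my own, not from transformers):

```python
import json

def resolve_do_lower_case(config_text: str) -> bool:
    """Read the lowercasing setting from a tokenizer config string.

    Falling back to False when the key is absent mirrors the behavior
    observed in this thread when the CLI flag was omitted.
    """
    config = json.loads(config_text)
    return bool(config.get("do_lower_case", False))

# An uncased checkpoint's config would be expected to set the key:
print(resolve_do_lower_case('{"do_lower_case": true}'))  # True
# A config missing the key silently disables lowercasing:
print(resolve_do_lower_case('{}'))                       # False
```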
