This repository hosts the corpus described in NOPE: A Corpus of Naturally-Occurring Presuppositions in English, as well as the raw data from the human and model experiments.
This archive contains the annotated main corpus (2,386 examples) and the corpus of adversarial examples (346 examples).
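The corpus files are distributed as JSON Lines (one JSON object per line). A minimal loading sketch, using only the standard library; the helper name is ours, and no assumptions are made about the per-line schema:

```python
import json

def load_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Hypothetical usage once the corpus files are downloaded:
# main_examples = load_jsonl("nli_corpus.main.jsonl")
# adv_examples = load_jsonl("nli_corpus.adv.jsonl")
```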
Refer to the README file in the `InferSent` directory for instructions on how to install and run the InferSent models.
- Clone our fork of the anli repo:

  ```bash
  git clone https://github.com/sebschu/anli.git
  ```
- Set up the ANLI models by following the instructions in ["Start your NLI Research"](https://github.com/sebschu/anli/blob/main/mds/start_your_nli_research.md).
- Train the models:

  To train the RoBERTa-large model on SNLI, MNLI, ANLI, and FEVER, run:

  ```bash
  export MASTER_PORT=88888
  export MASTER_ADDR=localhost

  # set up conda environment
  source setup.sh

  python src/nli/training.py \
    --model_class_name 'roberta-large' \
    --single_gpu \
    -n 1 \
    --seed 32423 \
    -g 1 \
    -nr 0 \
    --fp16 \
    --fp16_opt_level O2 \
    --max_length 156 \
    --gradient_accumulation_steps 1 \
    --per_gpu_train_batch_size 16 \
    --per_gpu_eval_batch_size 32 \
    --save_prediction \
    --train_data \
    snli_train:none,mnli_train:none,fever_train:none,anli_r1_train:none,anli_r2_train:none,anli_r3_train:none \
    --train_weights \
    1,1,1,10,20,10 \
    --eval_data \
    snli_dev:none,mnli_m_dev:none,mnli_mm_dev:none,anli_r1_dev:none,anli_r2_dev:none,anli_r3_dev:none \
    --eval_frequency 2000 \
    --experiment_name 'roberta-large|snli+mnli+fnli+r1*10+r2*20+r3*10|nli'
  ```
  To train the DeBERTa-v2-XLarge model on SNLI, MNLI, ANLI, and FEVER, run:

  ```bash
  export MASTER_PORT=88888
  export MASTER_ADDR=localhost

  # set up conda environment
  source setup.sh

  python src/nli/training.py \
    --model_class_name deberta \
    -n 1 \
    --seed 32423 \
    -g 2 \
    -nr 0 \
    --warmup_steps 1000 \
    --learning_rate 3e-6 \
    --fp16 \
    --fp16_opt_level O2 \
    --max_length 156 \
    --gradient_accumulation_steps 1 \
    --per_gpu_train_batch_size 16 \
    --per_gpu_eval_batch_size 32 \
    --save_prediction \
    --train_data \
    snli_train:none,mnli_train:none,fever_train:none,anli_r1_train:none,anli_r2_train:none,anli_r3_train:none \
    --train_weights \
    1,1,1,10,20,10 \
    --eval_data \
    snli_dev:none,mnli_m_dev:none,mnli_mm_dev:none,anli_r1_dev:none,anli_r2_dev:none,anli_r3_dev:none \
    --eval_frequency 2000 \
    --experiment_name 'deberta-v2-xlarge|snli+mnli+fnli+r1*10+r2*20+r3*10|nli'
  ```
- Evaluate the models:

  RoBERTa:

  ```bash
  export MASTER_PORT=88888
  export MASTER_ADDR=localhost

  # set up conda environment
  source setup.sh

  python src/nli/evaluation.py \
    --model_class_name 'roberta-large' \
    --max_length 156 \
    --per_gpu_eval_batch_size 16 \
    --model_checkpoint_path \
    <PATH_TO_MODEL_FROM_STEP3>/model.pt \
    --eval_data \
    nope_main:<PATH_TO_NOPE_CORPUS>/nli_corpus.main.jsonl,nope_adv:<PATH_TO_NOPE_CORPUS>/nli_corpus.adv.jsonl \
    --output_prediction_path <OUTPUT_PATH>/predictions/test_nope/
  ```
  DeBERTa:

  ```bash
  export MASTER_PORT=88888
  export MASTER_ADDR=localhost

  # set up conda environment
  source setup.sh

  python src/nli/evaluation.py \
    --model_class_name 'deberta' \
    --max_length 156 \
    --per_gpu_eval_batch_size 16 \
    --model_checkpoint_path \
    <PATH_TO_MODEL_FROM_STEP3>/model.pt \
    --eval_data \
    nope_main:<PATH_TO_NOPE_CORPUS>/nli_corpus.main.jsonl,nope_adv:<PATH_TO_NOPE_CORPUS>/nli_corpus.adv.jsonl \
    --output_prediction_path <OUTPUT_PATH>/predictions/test_nope/
  ```
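Once evaluation finishes, the predictions directory can be inspected with a short script. This is a sketch under assumptions: the prediction files are assumed to be JSON Lines, and the field name `predicted_label` is a guess, not the schema documented for `src/nli/evaluation.py`:

```python
import json
from collections import Counter

def label_distribution(pred_path, label_field="predicted_label"):
    """Tally predicted NLI labels in a JSON Lines predictions file.

    `label_field` is an assumed field name; adjust it to match the
    schema actually produced by the evaluation script.
    """
    counts = Counter()
    with open(pred_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                counts[json.loads(line)[label_field]] += 1
    return counts
```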
If you use the NOPE corpus, please cite the following paper:
```bibtex
@inproceedings{NOPE,
  title={{NOPE}: {A} Corpus of Naturally-Occurring Presuppositions in {E}nglish},
  author={Parrish, Alicia and Schuster, Sebastian and Warstadt, Alex and Agha, Omar and Lee, Soo-Hwan and Zhao, Zhuoye and Bowman, Samuel R. and Linzen, Tal},
  booktitle={Proceedings of the 25th Conference on Computational Natural Language Learning (CoNLL)},
  year={2021}
}
```