Code for the paper "Improving Generalization in Coreference Resolution via Adversarial Training"

sanjayss34/adv_coref


Improving Generalization in Coreference Resolution via Adversarial Training

This repository contains the code for reproducing the experiments in the paper "Improving Generalization in Coreference Resolution via Adversarial Training" by Sanjay Subramanian and Dan Roth, published at *SEM 2019.

Requirements

This code was tested using Python 2.7 and Ubuntu 16.04. The requirements.txt file lists the packages, with their versions, from the Python environment used for running this code. Please follow the Getting Started instructions in https://github.com/kentonl/e2e-coref to download the necessary files (e.g. word embeddings). You will also need to download the checkpoint for the Lee et al. 2018 model and insert the corresponding path in the lee2018_log_root field in experiments_adv.conf. git-lfs was used to store the adv_checkpoint.zip file, so you may need git-lfs to clone the repository.

Modify paths

Make sure the paths in experiments_adv.conf and replace_data.py are correct for your system. The allCountries.txt and countryInfo.txt files can be downloaded from geonames.org. The last_names.txt file contains the last names from the 1990 census, which can be downloaded from https://www2.census.gov/topics/genealogy/1990surnames/dist.all.last.
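Before running the scripts, it can help to confirm that the data files named above are where you configured them. The following is a minimal sketch, not part of this repository: the helper name missing_paths and the bare file names in the list are assumptions for illustration, so substitute the paths you actually set in experiments_adv.conf and replace_data.py.

```python
import os


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Hypothetical locations: replace these with the paths configured
    # in experiments_adv.conf and replace_data.py on your system.
    required = ["allCountries.txt", "countryInfo.txt", "last_names.txt"]
    for path in missing_paths(required):
        print("missing: " + path)
```

An empty result means every listed path was found.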

Reproducing Paper Results

First, unzip the adv_checkpoint.zip file to yield the adv_checkpoint directory. To reproduce the results in the paper, run prepare_data.sh and then run_experiments.sh, both with the repository as the working directory. Please note that by default the prepare_data.sh script loads the state of the random number generator that we used to generate replacement names, which enables exact reproducibility of our results. If you would like to generate replacement names at random, you need only comment out the relevant line in generate_noleakage.py. The results should match those in the paper: http://cogcomp.org/papers/SubramanianRo19.pdf .
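To illustrate why loading a saved random number generator state gives exact reproducibility, here is a minimal sketch using Python's standard random module. It is not the code in generate_noleakage.py: the function names, the pickle file format, and the choice of the random module (rather than, say, numpy.random) are all assumptions made for illustration.

```python
import pickle
import random


def save_rng_state(path):
    """Write the current state of Python's `random` module to `path`."""
    with open(path, "wb") as f:
        pickle.dump(random.getstate(), f)


def load_rng_state(path):
    """Restore the state of Python's `random` module from `path`."""
    with open(path, "rb") as f:
        random.setstate(pickle.load(f))


def sample_names(candidates, k):
    """Draw `k` names (with replacement) using the module-level generator."""
    return [random.choice(candidates) for _ in range(k)]
```

Calling load_rng_state before sample_names yields the same sequence of names on every run. Skipping that call, which is analogous to commenting out the relevant line, leaves the generator in whatever state it already had, so the sampled names will generally differ between runs.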

Acknowledgements

Much of the code in this repository is from Kenton Lee's repository https://github.com/kentonl/e2e-coref or is adapted from code in that repository. That code was distributed under an Apache 2.0 license. The firstname-gender-score.txt gazetteer was provided by Sihao Chen.

Citation

If you use this work in your research, please cite our paper:

@inproceedings{SubramanianRo19,
    author = {Sanjay Subramanian and Dan Roth},
    title = {{Improving Generalization in Coreference Resolution via Adversarial Training}},
    booktitle = {Proc. of the Joint Conference on Lexical and Computational Semantics},
    month = {6},
    year = {2019},
    url = "http://cogcomp.org/papers/SubramanianRo19.pdf",
}
