
11-711 Prefix Tuning Reproduction

This repo contains the code we used to reproduce the paper Prefix-Tuning: Optimizing Continuous Prompts for Generation.

The main backbone of this code is the Prefix-Tuning repo published by the authors of the original paper. We cloned this and created our own version here.

Some changes we made include the following:

  • adjusting some file paths
  • adding processed datasets for the low-data experiment settings

Table of Contents

  0. Quickstart
  1. Table-to-text Results
  2. Summarization Results
  3. Low-data Settings
  4. Ablation Studies

0. Quickstart

We recommend looking at the accompanying notebooks, which cover all the steps, including cloning the repos, downloading dependencies, running the training scripts, and running evaluation.


Alternatively, follow the instructions below to reproduce our results:

  1. Clone the following repos:
$ git clone https://github.com/sedrickkeh/PrefixTuning.git
$ git clone https://github.com/sedrickkeh/dart.git
$ git clone https://github.com/tuetschek/e2e-metrics.git
  2. Navigate into the transformers folder in PrefixTuning and install the following dependencies:
cd PrefixTuning/transformers
pip install -e .
pip install git+https://github.com/PyTorchLightning/pytorch-lightning
pip install gitpython
pip install rouge_score
pip install sacrebleu
pip install unidecode
  3. Run experiment code (example below):
python train_e2e.py --preseqlen 5 --learning_rate 0.00008 --seed 88 --epoch 5
  4. Run evaluation code:
bash ./dart/evaluation/run_eval_on_webnlg.sh

For steps 3 and 4 above, note that you may need to modify some of the file paths.
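
For example, if the evaluation script hard-codes the path of the generated outputs, you can rewrite it in place before running it. The variable name OUTPUT_FILE and the path below are hypothetical placeholders; check run_eval_on_webnlg.sh for the names it actually uses.

# Hypothetical sketch: OUTPUT_FILE and the path are placeholders, not the actual names in run_eval_on_webnlg.sh.
sed -i 's|^OUTPUT_FILE=.*|OUTPUT_FILE=/path/to/your/generated_outputs.txt|' ./dart/evaluation/run_eval_on_webnlg.sh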

1. Table-to-text Results

There are three datasets here, namely E2E, WebNLG, and DART.

1. E2E

The script below automatically runs evaluation after training the model.

python train_e2e.py --preseqlen 5 --learning_rate 0.00007 --seed 22 --epoch 5 --notes earlystop

2. WebNLG

Training

python train_e2e.py --mode webnlg --preseqlen 5 --learning_rate 0.00005 --bsz 5 --seed 222 --epoch 5 --notes earlystop

Evaluation

bash ./dart/evaluation/run_eval_on_webnlg.sh

3. DART

Training

python train_e2e.py --mode triples --preseqlen 20 --seed 9 --bsz 5 --epoch 5 --learning_rate 0.00008

Evaluation for DART takes quite long and may require installing additional libraries. Please refer to the DART notebook.

2. Summarization Results

cd seq2seq

python train_bart.py --mode xsum --preseqlen 200 --do_train yes --fp16 yes --bsz 2 --epoch 15 --gradient_accumulation_step 3 --learning_rate 0.00005 --mid_dim 800

3. Low-data Settings

Before running these experiments, first construct the low-data datasets using this script.
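
If you want to build the subsets manually instead, one simple way is to take a small slice of the full E2E training split. A minimal sketch is below; the file names e2e_train_source.txt and e2e_train_target.txt are hypothetical placeholders for the actual E2E source/target training files in your setup.

# Hypothetical file names; point these at the actual E2E training source/target files.
head -n 50 e2e_train_source.txt > e2e_train_source_lowdata_50.txt
head -n 50 e2e_train_target.txt > e2e_train_target_lowdata_50.txt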

Dataset size 50

python train_e2e.py --preseqlen 5 --learning_rate 8e-5 --seed 88 --bsz 10 --lowdata_token 'table-to-text-restaurant:' --epoch 100 --warmup_steps 300 --notes earlystoplowdata_88_50

Dataset size 100

python train_e2e.py --preseqlen 5 --learning_rate 7e-5 --seed 88 --bsz 10 --lowdata_token 'table-to-text-restaurant:' --epoch 100 --warmup_steps 100 --notes earlystoplowdata_88_100

Experiments were done for dataset sizes 50, 100, 200, and 500. Scripts for dataset sizes 200 and 500 are analogous to the ones above. Exact hyperparameters can be found in the appendix of our submitted report.
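
As a rough sketch, the commands for sizes 200 and 500 follow the same pattern as above; the learning rate and warmup steps below are placeholders and should be replaced with the exact values from the report appendix.

# LR and WARMUP are placeholders; substitute the exact values from the report appendix.
LR=8e-5
WARMUP=100
python train_e2e.py --preseqlen 5 --learning_rate $LR --seed 88 --bsz 10 --lowdata_token 'table-to-text-restaurant:' --epoch 100 --warmup_steps $WARMUP --notes earlystoplowdata_88_200
python train_e2e.py --preseqlen 5 --learning_rate $LR --seed 88 --bsz 10 --lowdata_token 'table-to-text-restaurant:' --epoch 100 --warmup_steps $WARMUP --notes earlystoplowdata_88_500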

4. Ablation Studies

We conduct two ablation studies:

  1. Prefix Length
    This builds on the experiments for DART (see the sweep sketch after this list).
python train_e2e.py --mode triples --preseqlen (prefix_length) --seed 9 --bsz 5 --epoch 5 --learning_rate 8e-5
  2. Prefix Initialization
    This builds on the experiments for the low-data E2E settings.
python train_e2e.py --preseqlen 5 --lowdata_token (insert_initialization_here) --learning_rate 7e-5 --seed 88 --bsz 10 --epoch 100 --warmup_steps 100 --notes earlystoplowdata_88_500
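
For the prefix-length ablation, the individual runs can be scripted as a simple sweep; the set of lengths below is illustrative rather than the exact grid used in our report.

# Illustrative prefix-length sweep over the DART setting; adjust the list of lengths as needed.
for PLEN in 1 5 10 20 40; do
    python train_e2e.py --mode triples --preseqlen $PLEN --seed 9 --bsz 5 --epoch 5 --learning_rate 8e-5
done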
