The experiments were conducted using Python 3.6 and PyTorch 1.2.0 on a server with one or eight NVIDIA V100 GPUs. We used NVIDIA's PyTorch Docker container 19.02. For computational efficiency, we used mixed-precision training based on the APEX library, which can be installed as follows:
git clone https://github.com/NVIDIA/apex.git
cd apex
git checkout c3fad1ad120b23055f6630da0b029c8b626db78f
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .
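If the build succeeds, the amp module should be importable. A quick sanity check (note that this does not verify that the CUDA extensions were compiled):
python -c "from apex import amp"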
The APEX library is not needed if you do not use the --fp16 option or if you reproduce the results from the trained checkpoint files.
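For reference, the --fp16 option relies on APEX's amp module. Below is a minimal sketch of how amp-style mixed-precision training is typically wired up; the model, optimizer, and opt_level are illustrative placeholders, not the exact configuration used by this code.

```python
import torch
from apex import amp

# Placeholder model and optimizer, for illustration only.
model = torch.nn.Linear(768, 2).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

# Patch the model and optimizer for mixed precision ("O1" runs selected ops in fp16).
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

inputs = torch.randn(4, 768).cuda()
labels = torch.randint(0, 2, (4,)).cuda()
loss = torch.nn.functional.cross_entropy(model(inputs), labels)

# Scale the loss to avoid fp16 underflow, then backpropagate and update.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```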
The commands that reproduce the experimental results are provided below. To run them, please download the model checkpoints from the links in the following table.
Name | Base Model | Entity Vocab Size | Params | Download |
---|---|---|---|---|
LUKE-500K (base) | roberta.base | 500K | 253 M | Link |
LUKE-500K (large) | roberta.large | 500K | 484 M | Link |
**Entity typing**

Dataset: Link
Checkpoint file (compressed): Link
Prepare the dataset:
gdown --id 1HlWw7Q6-dFSm9jNSCh4VaBf1PlGqt9im
tar xzf data.tar.gz
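The task-specific checkpoint linked above is also distributed as a compressed archive. A sketch of fetching and unpacking it, using hypothetical placeholders for the file ID and archive name (if the CLI accepts the compressed file directly, the extraction step can be skipped):
gdown --id <CHECKPOINT_FILE_ID>
tar xzf <CHECKPOINT_ARCHIVE>.tar.gz
The extracted file can then be supplied as <CHECKPOINT_FILE> in the command below.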
Using the checkpoint file:
python -m examples.cli \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
entity-typing run \
--data-dir=<DATA_DIR> \
--checkpoint-file=<CHECKPOINT_FILE> \
--no-train
Fine-tuning the model:
python -m examples.cli \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
entity-typing run \
--data-dir=<DATA_DIR> \
--train-batch-size=2 \
--gradient-accumulation-steps=2 \
--learning-rate=1e-5 \
--num-train-epochs=3 \
--fp16
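As a rough guide to these hyperparameters, the effective batch size is the product of --train-batch-size and --gradient-accumulation-steps (and, for the multi-GPU commands further below, --num-gpus, assuming --train-batch-size is per GPU). An illustrative calculation:

```python
# Illustrative only: assumes --train-batch-size is the per-GPU batch size.
def effective_batch_size(train_batch_size, grad_accum_steps, num_gpus=1):
    return train_batch_size * grad_accum_steps * num_gpus

# Entity typing command above: 2 * 2 * 1 = 4 examples per optimizer update.
print(effective_batch_size(2, 2))
```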
**Relation classification**

Dataset: Link
Checkpoint file (compressed): Link
Using the checkpoint file:
python -m examples.cli \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
relation-classification run \
--data-dir=<DATA_DIR> \
--checkpoint-file=<CHECKPOINT_FILE> \
--no-train
Fine-tuning the model:
python -m examples.cli \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
relation-classification run \
--data-dir=<DATA_DIR> \
--train-batch-size=4 \
--gradient-accumulation-steps=8 \
--learning-rate=1e-5 \
--num-train-epochs=5 \
--fp16
**Named entity recognition (NER)**

Dataset: Link
Checkpoint file (compressed): Link
Using the checkpoint file:
python -m examples.cli \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
ner run \
--data-dir=<DATA_DIR> \
--checkpoint-file=<CHECKPOINT_FILE> \
--no-train
Fine-tuning the model:
python -m examples.cli \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
ner run \
--data-dir=<DATA_DIR> \
--train-batch-size=2 \
--gradient-accumulation-steps=4 \
--learning-rate=1e-5 \
--num-train-epochs=5 \
--fp16
**Entity span QA**

Dataset: Link
Checkpoint file (compressed): Link
Using the checkpoint file:
python -m examples.cli \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
entity-span-qa run \
--data-dir=<DATA_DIR> \
--checkpoint-file=<CHECKPOINT_FILE> \
--no-train
Fine-tuning the model:
python -m examples.cli \
--num-gpus=8 \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
entity-span-qa run \
--data-dir=<DATA_DIR> \
--train-batch-size=1 \
--gradient-accumulation-steps=4 \
--learning-rate=1e-5 \
--num-train-epochs=2 \
--fp16
**Reading comprehension**

Dataset: Link
Checkpoint file (compressed): Link
Wikipedia data files (compressed): Link
Using the checkpoint file:
python -m examples.cli \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
reading-comprehension run \
--data-dir=<DATA_DIR> \
--checkpoint-file=<CHECKPOINT_FILE> \
--no-negative \
--wiki-link-db-file=enwiki_20160305.pkl \
--model-redirects-file=enwiki_20181220_redirects.pkl \
--link-redirects-file=enwiki_20160305_redirects.pkl \
--no-train
Fine-tuning the model:
python -m examples.cli \
--num-gpus=8 \
--model-file=luke_large_500k.tar.gz \
--output-dir=<OUTPUT_DIR> \
reading-comprehension run \
--data-dir=<DATA_DIR> \
--no-negative \
--wiki-link-db-file=enwiki_20160305.pkl \
--model-redirects-file=enwiki_20181220_redirects.pkl \
--link-redirects-file=enwiki_20160305_redirects.pkl \
--train-batch-size=2 \
--gradient-accumulation-steps=3 \
--learning-rate=15e-6 \
--num-train-epochs=2 \
--fp16