Applying BERT to the CoNLL-2003 NER task
Create a Python 3 virtual environment and install the dependencies:
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
python -m cbert.download_conll2003
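The CoNLL-2003 data uses a simple text layout: one token per line with space-separated columns (token, POS tag, chunk tag, NER tag), blank lines between sentences, and `-DOCSTART-` lines marking document boundaries. A minimal sketch of parsing it (the exact path `download_conll2003` writes to is not shown here, so the sample is inlined):

```python
# Parse CoNLL-2003-style lines into (tokens, tags) sentence pairs.
# Columns are: token, POS, chunk, NER tag; the NER tag is last.

def read_conll(lines):
    """Yield (tokens, tags) pairs, one per sentence."""
    tokens, tags = [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("-DOCSTART-"):
            if tokens:
                yield tokens, tags
                tokens, tags = [], []
            continue
        cols = line.split()
        tokens.append(cols[0])
        tags.append(cols[-1])  # NER tag is the last column
    if tokens:  # flush the final sentence
        yield tokens, tags

sample = """\
-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O
"""
sentences = list(read_conll(sample.splitlines()))
```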
Skip this step if you are not interested in trying the GloVe-based models.
python -m cbert.download_glove
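GloVe vectors ship as plain text: each line is a word followed by its float components. A sketch of loading them into a dict (the sample lines below are illustrative, not from the actual download):

```python
# Load GloVe-format text vectors into {word: [floats]}.

def load_glove(lines):
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word, vec = parts[0], [float(x) for x in parts[1:]]
        vectors[word] = vec
    return vectors

sample = [
    "the 0.1 0.2 0.3",
    "cat 0.4 0.5 0.6",
]
emb = load_glove(sample)
```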
- GloVe + a simple one-layer bidi LSTM:
python -m cbert.train traindir-A
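The architecture of this baseline can be sketched as follows; this is a hypothetical PyTorch illustration, not the repo's actual `cbert.train` model, and the dimensions are made up (the embedding would be initialized from the downloaded GloVe weights):

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Embedding -> one bidirectional LSTM layer -> per-token tag scores."""
    def __init__(self, vocab_size=1000, emb_dim=100, hidden=128, num_tags=9):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=1,
                            bidirectional=True, batch_first=True)
        # 2 * hidden: forward and backward states are concatenated
        self.out = nn.Linear(2 * hidden, num_tags)

    def forward(self, token_ids):
        x = self.emb(token_ids)   # (batch, seq, emb_dim)
        x, _ = self.lstm(x)       # (batch, seq, 2 * hidden)
        return self.out(x)        # (batch, seq, num_tags)

model = BiLSTMTagger()
scores = model(torch.randint(0, 1000, (2, 7)))  # batch of 2, length 7
```

CoNLL-2003 has 9 BIO tags (`O` plus `B-`/`I-` for PER, ORG, LOC, MISC), hence `num_tags=9`.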
This will create the given train directory; checkpoints, TensorBoard stats, etc. will be saved there.
- BERT as a frozen embedding + a simple one-layer bidi LSTM learned on top:
python -m cbert.train01 traindir-B
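"Frozen" here means the BERT weights receive no gradient updates; only the LSTM and output head on top are trained. A generic sketch of the mechanism (a plain `nn.Embedding` stands in for the BERT encoder; this is not the repo's actual code):

```python
import torch.nn as nn

embedder = nn.Embedding(1000, 768)   # stand-in for a BERT encoder
head = nn.Linear(768, 9)             # trainable part on top

for p in embedder.parameters():
    p.requires_grad = False          # freeze: excluded from training

# Only parameters that still require gradients go to the optimizer.
trainable = [p for m in (embedder, head) for p in m.parameters()
             if p.requires_grad]
```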
- BERT with a simple one-layer bidi LSTM on top (all layers are trained):
python -m cbert.train02 traindir-B
- GloVe embeddings + AWD-LSTM from fastai (not working; bidi support is missing at the moment):
python -m cbert.train03 traindir-C
- BERT tagger (nothing on top, just a dense layer); all layers are trained:
python -m cbert.train04 traindir-D
Training, validation, and test statistics are written to the train directory and can be viewed with TensorBoard:
pip install tensorboard
tensorboard --logdir .
Results:
- Baseline bidi LSTM + GloVe: dev_F1: 92.7, test_F1: 88.9
- BERT as embedding:
- BERT + bidi LSTM: dev_F1: 94.5, test_F1: 90.0
- AWD-LSTM + GloVe:
- BERT tagger: dev_F1: 95.6, test_F1: 91.0
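For CoNLL-style NER, F1 is conventionally computed over whole entity spans decoded from the BIO tags, not over individual tokens: a predicted entity counts as correct only if its type, start, and end all match. A pure-Python sketch (assumes well-formed BIO sequences; not the repo's actual evaluation code):

```python
# Decode BIO tags into (type, start, end) spans, then score span overlap.

def spans(tags):
    out, start = set(), None
    for i, t in enumerate(tags + ["O"]):  # sentinel closes the last span
        if start is not None and not t.startswith("I-"):
            out.add((tags[start][2:], start, i))
            start = None
        if t.startswith("B-"):
            start = i
    return out

def f1(gold, pred):
    g, p = spans(gold), spans(pred)
    tp = len(g & p)                       # exact span matches
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

score = f1(["B-ORG", "O", "B-MISC", "O"],
           ["B-ORG", "O", "O", "O"])
```

With one of two gold spans found and no false positives, precision is 1.0 and recall 0.5, so F1 is 2/3.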