This repository contains the code for the paper *Supervised Learning of Universal Sentence Representations from Natural Language Inference Data*, developed for the course Advanced Topics in Computational Semantics at the University of Amsterdam.
The repository is structured as follows:
- `data/` contains the scripts to download the data for the experiments and SentEval. After training, the vocabulary and embeddings are stored here as well.
- `logs/` contains the Lisa logs from training.
- `models/` contains the pre-trained models.
- `runs/` contains the TensorBoard logs, stored in a directory named after the model.
- `src/` contains the source code of the project.
- `results.ipynb` contains the prediction code, the results of the experiments, and the discussion.
- `requirements.txt` contains the requirements for the project.
- `README.md` contains the instructions for the project.
- `pyproject.toml` contains the project configuration.
The code is written in Python 3.10. The requirements can be installed with `pip install -r requirements.txt` or from the conda environment file with `conda env create -f environment.yml`.
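For a fresh setup, the two options side by side (a minimal sketch; the conda environment name is an assumption, check `environment.yml` for the real one):

```bash
# Option 1: pip, assuming Python 3.10 is the active interpreter
pip install -r requirements.txt

# Option 2: conda, using the provided environment file
conda env create -f environment.yml
conda activate atcs  # hypothetical environment name; see environment.yml
```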
The Stanford Natural Language Inference (SNLI) corpus will be downloaded automatically when running the training script. The SentEval datasets can be downloaded with the following command from the `data/downstream` directory:
```bash
bash ./get_transfer_data.bash
```
You can train a model using the following command:
```bash
python src/train.py --encoder <encoder>
```
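For example, to train a bidirectional LSTM encoder with max pooling, as in the paper (the encoder name used here is hypothetical; run `python src/train.py -h` to see the accepted values):

```bash
# Hypothetical encoder name; -h lists the real choices.
python src/train.py --encoder bilstm-max
```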
You can evaluate a model using the following command:
```bash
python src/eval.py --checkpoint <checkpoint> --encoder <encoder> --eval --senteval
```
The `--eval` flag evaluates the model on the SNLI dataset, and the `--senteval` flag evaluates it on the SentEval datasets. See `-h` for more options, such as changing the batch size or the number of epochs.
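A full evaluation run might then look as follows (the checkpoint path and encoder name are hypothetical; point `--checkpoint` at a checkpoint from your own training run or at a downloaded pre-trained model):

```bash
# Evaluate a (hypothetical) BiLSTM-max checkpoint on both SNLI and SentEval.
python src/eval.py --checkpoint models/bilstm-max.pt --encoder bilstm-max --eval --senteval
```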
The pre-trained models can be downloaded from here. The models should be placed in the `models/` directory.