Prompting uses the ability of large language models (LLMs) to "fill in the blank" in order to classify the meaning of text. I conducted research on how a single model can use prompting to learn multiple tasks simultaneously; RoBERTa was the primary model I experimented with, and the results are presented below. This repository contains the code I used to train and evaluate models during my experiments.
Here is a diagram from this paper that explains the difference between prompting and head-based fine-tuning for language models.
The paper *How Many Data Points is a Prompt Worth?* runs some cool experiments demonstrating the power of prompts in low-resource settings.
RobertaPrompt wraps HuggingFace's RobertaForMaskedLM class and lets developers train and test a RoBERTa model using prompting, based on a prompt definition.
Suppose we have two tasks: given an argument and a topic, we must detect whether the argument supports or opposes the topic (stance detection), and whether the argument contains a fallacy and, if so, which one (fallacy detection).
A prompt definition would contain:
- A template for each task. A template is a consistent text pattern associated with a task so the model recognizes which task needs to be completed. For example,
"Stance detection task. Topic: {insert topic here} and Argument: {insert argument here}. The stance is: <mask>"
- A policy function for each task. The policy function maps the token the model fills in the blank with to a predicted label (see the sketch after this list).
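The exact format of a prompt definition is determined by this repository's code; the sketch below is only illustrative. It assumes each task maps to a template string and a policy function, and it reuses the `argument_prompt` name from the quick-start below. The dict-and-lambda structure is an assumption, not the documented format.

```python
# Illustrative sketch only: the real prompt-definition format is defined by
# this repo's code. Assumption: each task pairs a template with a policy.
argument_prompt = {
    "stance": {
        # <mask> is the blank RoBERTa fills in; {topic}/{argument} are slots
        # filled from each example.
        "template": "Stance detection task. Topic: {topic} and Argument: {argument}. The stance is: <mask>",
        # Policy function: map the predicted token to a task label.
        "policy": lambda token: {"support": "SUPPORT", "against": "AGAINST"}.get(token.strip().lower()),
    },
    "fallacy": {
        "template": "Fallacy detection task. Topic: {topic} and Argument: {argument}. The fallacy is: <mask>",
        # Policy function: the predicted token itself names the fallacy
        # (or "none" when the argument is sound).
        "policy": lambda token: token.strip().lower(),
    },
}
```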
My experiments trained a RoBERTa model to accomplish exactly the tasks mentioned above; you can take a look at some example predictions in the prompting_example.ipynb notebook.
First, load a base model. Using a GPU as the device is highly recommended.
import torch
pmodel = RobertaPrompt(model='roberta-large', device=torch.device('cuda'), prompt=argument_prompt)
Start training immediately by specifying the paths to a training set and a validation set. Training statistics are printed to stdout.
pmodel.train("sample_train_set.tsv", "sample_val_set.tsv", output_dir="sample_model", epochs=10)
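Presumably, the trained model weights are saved under output_dir (sample_model in this example); treat that as an assumption about this repo's behavior rather than documented fact.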
After training is finished, evaluate the model on a test set using the following function, which saves the test results:
pmodel.test("sample_test_set.tsv", save_path='stats.txt')
The file specified by save_path will contain overall F1 scores, along with more fine-grained statistics on model performance for each label.
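The exact layout of the stats file comes from this repository's evaluation code; the sketch below only indicates the kind of content to expect, and every value shown is a placeholder.

```
Overall F1: <overall f1>
Label SUPPORT:     precision <p>   recall <r>   f1 <f>
Label AGAINST:     precision <p>   recall <r>   f1 <f>
Label NO FALLACY:  precision <p>   recall <r>   f1 <f>
Label <fallacy X>: precision <p>   recall <r>   f1 <f>
```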
One can then use this model and fine-tune it on other tasks with different prompts.
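As a hypothetical sketch of such a continuation: the snippet below assumes that RobertaPrompt(model=...) accepts the local directory saved above and that new_task_prompt is another prompt definition like argument_prompt; neither is a documented feature of this API.

```python
# Hypothetical: reuse the weights trained above for a new task.
pmodel2 = RobertaPrompt(model='sample_model', device=torch.device('cuda'), prompt=new_task_prompt)
pmodel2.train("new_train_set.tsv", "new_val_set.tsv", output_dir="new_model", epochs=10)
```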
Sample data for fallacious argument and stance detection is from Argotario.