Discovering Human Values in Arguments

This repo contains the best-performing system (Adam-Smith) of SemEval-2023 Task 4 (ValueEval: Identification of Human Values behind Arguments).

Link to competition: https://touche.webis.de/semeval23/touche23-web/index.html

Link to WebDemo: https://values.args.me/

Link to System Description Paper: https://arxiv.org/abs/2305.08625

Link to Hugging Face Model: https://huggingface.co/tum-nlp/Deberta_Human_Value_Detector

Link to Docker Container: https://github.com/touche-webis-de/team-adam-smith23


1. Set Up Project

If you only need the best-performing single model, the Hugging Face version is the easiest to use: https://huggingface.co/tum-nlp/Deberta_Human_Value_Detector

1.1 Install dependencies

Create a conda environment with Python 3.10 and install the required packages:

pip install -r requirements.txt

1.2 Get Data

Download the competition data and place it in the data directory: https://zenodo.org/record/7550385

1.3 Get Models

The trained models can be downloaded from https://zenodo.org/records/7656534. Download the models together with their corresponding PARAMS files and place them in the checkpoints directory. If you want to train the models yourself, see the instructions in section 3, "Retrain from Scratch".

2. Reproduce Competition Results

The ensmebling_and_predict.ipynb notebook reproduces the competition results. To run it, make sure the trained models are downloaded and placed in the checkpoints folder together with their PARAMS files (see 1.3 Get Models). This notebook is the foundation for the Docker container published as part of the competition.
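As a rough illustration of what an ensembling step of this kind does, here is a minimal sketch. It assumes each model outputs one probability per argument and value label, that the ensemble averages these probabilities, and that a single global threshold turns them into labels. The function name `ensemble_predict` and the toy numbers are hypothetical and not taken from the notebook.

```python
import numpy as np

def ensemble_predict(prob_list, threshold=0.5):
    """Average per-model probabilities and binarize with one global threshold.

    prob_list: list of arrays, each of shape (n_arguments, n_value_labels).
    Assumption: plain averaging; the notebook may combine models differently.
    """
    stacked = np.stack(prob_list, axis=0)   # (n_models, n_arguments, n_labels)
    mean_probs = stacked.mean(axis=0)       # average over the model axis
    return (mean_probs >= threshold).astype(int)

# Toy probabilities from two hypothetical models: 2 arguments x 3 value labels.
model_a = np.array([[0.9, 0.2, 0.4], [0.1, 0.7, 0.6]])
model_b = np.array([[0.7, 0.4, 0.2], [0.3, 0.5, 0.2]])

predictions = ensemble_predict([model_a, model_b], threshold=0.5)
print(predictions)
```

Averaging before thresholding keeps the decision rule identical for every model in the ensemble, so only one threshold has to be tuned.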

3. Retrain from Scratch


To understand the training process and retrain the models yourself, follow these three steps:

  1. Generate the dataset and the leave-out dataset: data_generation.ipynb
  2. Train the models with the configurations from the paper: train.ipynb
  3. Calculate the optimal threshold of the ensemble and predict the final submission file (also includes stacking for ensemble variations): Ensemble_eval_and_predict.ipynb
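The threshold selection in step 3 can be sketched as a simple grid search. This is only an illustration under stated assumptions: the ensemble's averaged probabilities on the leave-out dataset are already available, the metric is macro-averaged F1 over the value labels, and one global threshold is used. The helper names `macro_f1` and `best_global_threshold`, the candidate grid, and the toy data are all hypothetical.

```python
import numpy as np

def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over label columns; a label with no positives
    and no predictions scores 0."""
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=0)
    fp = np.sum((y_true == 0) & (y_pred == 1), axis=0)
    fn = np.sum((y_true == 1) & (y_pred == 0), axis=0)
    denom = 2 * tp + fp + fn
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    return float(f1.mean())

def best_global_threshold(y_true, mean_probs, candidates=None):
    """Return the candidate threshold with the highest macro F1 and its score."""
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)  # assumed grid, step 0.05
    scores = [macro_f1(y_true, (mean_probs >= t).astype(int)) for t in candidates]
    best = int(np.argmax(scores))                 # first maximum wins on ties
    return float(candidates[best]), scores[best]

# Toy leave-out data: 4 arguments x 2 value labels.
y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
mean_probs = np.array([[0.82, 0.33], [0.41, 0.71], [0.63, 0.93], [0.18, 0.12]])

best_t, best_score = best_global_threshold(y_true, mean_probs)
print(best_t, best_score)
```

Tuning the threshold on data the models did not see during training avoids picking a value that only fits the training distribution.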