This repository contains the best-performing system (Adam-Smith) of SemEval-2023 Task 4 - ValueEval: Identification of Human Values behind Arguments.
Link to competition: https://touche.webis.de/semeval23/touche23-web/index.html
Link to Web Demo: https://values.args.me/
Link to System Description Paper: https://arxiv.org/abs/2305.08625
Link to Hugging Face Model: https://huggingface.co/tum-nlp/Deberta_Human_Value_Detector
Link to Docker Container: https://github.com/touche-webis-de/team-adam-smith23
If you want the best-performing single model, consider the Hugging Face version for ease of use: https://huggingface.co/tum-nlp/Deberta_Human_Value_Detector
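The snippet below is a minimal inference sketch, assuming the checkpoint loads as a standard multi-label `AutoModelForSequenceClassification`; the input format and the 0.5 cutoff are illustrative assumptions, so consult the model card for the exact usage.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "tum-nlp/Deberta_Human_Value_Detector"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Example argument; how premise, stance and conclusion are concatenated
# is an assumption here -- check the model card for the expected format.
text = "We should ban fast food because it is unhealthy and addictive."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label task: one sigmoid probability per human-value category.
probs = torch.sigmoid(logits).squeeze(0)
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```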
Create a conda environment with Python 3.10 and install the required packages:

```bash
pip install -r requirements.txt
```
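If you are starting from scratch, the full setup might look like this; the environment name `valueeval` is an illustrative assumption, not prescribed by the repository:

```bash
# environment name is hypothetical; any name works
conda create -n valueeval python=3.10
conda activate valueeval
pip install -r requirements.txt
```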
Get the data from the competition and place it in the `data` directory: https://zenodo.org/record/7550385
The trained models can be downloaded from https://zenodo.org/records/7656534. Download the models together with their corresponding PARAMS files and place them in the `checkpoints` directory. If you want to train the models yourself, you can find the instructions in the Training section (#Training).
The `ensmebling_and_predict.ipynb` notebook reproduces the competition results. To run it, you need the trained models in place, so make sure you have downloaded them and placed them in the `checkpoints` folder together with their PARAMS files (see Get Models). This notebook is the foundation for the Docker container published in the context of the data science competition.
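For orientation, the core idea in the notebook is to average the per-label probabilities of the individual models and binarize them with a tuned decision threshold. The following is a minimal sketch under that assumption; the function name and default threshold are illustrative, not the notebook's actual API:

```python
import numpy as np

def ensemble_predict(prob_matrices, threshold=0.5):
    """Average per-label probabilities from several models and binarize.

    prob_matrices: list of arrays of shape (n_examples, n_labels),
    one per trained model. The default threshold is a placeholder;
    the tuned value comes from the threshold-search step.
    """
    mean_probs = np.mean(prob_matrices, axis=0)
    return (mean_probs >= threshold).astype(int)
```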
If you want to understand the training process and retrain the models yourself, the process is split into three steps:
- Generate the dataset and the leave-out dataset: `data_generation.ipynb`
- Train the models with the configurations from the paper: `train.ipynb`
- Calculate the optimal threshold of the ensemble and predict the final submission file; this also includes stacking for ensemble variations (see the sketch after this list): `Ensemble_eval_and_predict.ipynb`
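As a rough illustration of the threshold step, a single global threshold can be grid-searched to maximize macro F1 on a held-out split. This is a sketch with illustrative names; the notebook's actual procedure may differ (e.g. per-label thresholds or stacking):

```python
import numpy as np
from sklearn.metrics import f1_score

def find_optimal_threshold(y_true, mean_probs, grid=np.arange(0.05, 0.95, 0.01)):
    """Return the global threshold that maximizes macro F1 on validation data.

    y_true: binary array of shape (n_examples, n_labels);
    mean_probs: ensemble-averaged probabilities of the same shape.
    """
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        f1 = f1_score(y_true, (mean_probs >= t).astype(int),
                      average="macro", zero_division=0)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```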