GitHub - IIT-DM/Fin-Fact: A Benchmark Dataset for Multimodal Scientific Fact Checking

Fin-Fact - Multimodal Financial Fact-Checking Dataset

Overview

Welcome to the Fin-Fact repository! Fin-Fact is a dataset designed for financial fact-checking and explanation generation. This README gives an overview of the dataset, explains how to use it, and shows how to reproduce the metrics reported in the paper. See the accompanying paper for details.

Dataset Description

Name: Fin-Fact
Purpose: Fact-checking and explanation generation in the financial domain.
Labels: Each claim is annotated with the following fields: Claim, Author, Posted Date, Sci-digest, Justification, Evidence, Evidence href, Image href, Image Caption, Visualisation Bias Label, Issues, and Claim Label.
Size: The dataset consists of 3562 claims spanning multiple financial sectors.
Additional Features: The dataset goes beyond textual claims and incorporates visual elements, including images and their captions.

Dataset Usage

Fin-Fact is a valuable resource for researchers, data scientists, and fact-checkers in the financial domain. Here's how you can use it:

Download the Dataset: You can download the Fin-Fact dataset here or via the Hugging Face Hub. You can also load the dataset by using the following code:

from datasets import load_dataset
dataset = load_dataset("amanrangapur/Fin-Fact")

Exploratory Data Analysis: Perform exploratory data analysis to understand the dataset's structure, distribution, and any potential biases.
Natural Language Processing (NLP) Tasks: Utilize the dataset for various NLP tasks such as fact-checking, claim verification, and explanation generation.
Fact Checking Experiments: Train and evaluate machine learning models, including text and image analysis, using the dataset to enhance the accuracy of fact-checking systems.
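
As a starting point for the exploratory analysis mentioned above, the sketch below counts how often each claim label occurs. It assumes the records are available as a list of dictionaries and that the label is stored under a key named `label`; both the helper name `label_distribution` and the key name are illustrative and may not match the actual schema of finfact.json.

```python
from collections import Counter


def label_distribution(records, label_key="label"):
    """Count how often each claim label occurs in a list of record dicts."""
    # `label_key` is an assumed field name; adjust it to the real schema.
    counts = Counter(rec.get(label_key, "<missing>") for rec in records)
    total = sum(counts.values())
    # Return (label, count, share) tuples, most frequent first.
    return [(label, n, n / total) for label, n in counts.most_common()]


# Tiny in-memory stand-in for real records (hypothetical values).
sample = [
    {"claim": "A", "label": "true"},
    {"claim": "B", "label": "false"},
    {"claim": "C", "label": "false"},
    {"claim": "D"},
]

for label, n, share in label_distribution(sample):
    print(f"{label:10s} {n:3d} {share:.0%}")
```

A skewed distribution here is an early hint that accuracy alone may be a misleading metric for later experiments.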

Installation

Requires Python 3.9 to run.

Create the conda environment from the environment.yml file:

conda env create -n finfact --file environment.yml
conda activate finfact

Run models for paper metrics

We provide scripts that let you run existing state-of-the-art models on our dataset and re-create the metrics published in the paper. You should be able to reproduce our results by following these instructions; please open an issue if you are unable to do so. The commands below cover the LLM experiments and the existing ANLI models for fact checking.

Usage for LLMs:

Create a .env file and set your API keys:

OPENAI_API_KEY="YOUR KEY"
GEMINI_API_KEY="YOUR KEY"
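
For readers unfamiliar with .env files, the following is a minimal sketch of how such a file could be read into the process environment using only the standard library. The helper `parse_env_text` is hypothetical and is not taken from the repository; the project's scripts may load the keys differently (for example through a dedicated package).

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse simple KEY="VALUE" lines into a dict (illustrative helper)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


env_path = Path(".env")
if env_path.exists():
    # Make the keys visible to code that reads os.environ.
    os.environ.update(parse_env_text(env_path.read_text(encoding="utf-8")))
```

Keeping the .env file out of version control avoids leaking the keys.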

To run MLLM experiments, choose one of the slash-separated values for each option:

python scripts/models/experiments.py --model ['llava/gpt-4/gemini'] --prompt_type ['open_book/closed_book/cot/symbolic/self_help']
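
To run every model with every prompt type, the combinations can be enumerated programmatically. The sketch below only builds and prints the command lines; it assumes each option takes a single value spelled exactly as in the template above, which is worth verifying against the script's argument parser. The helper name `experiment_commands` is illustrative.

```python
from itertools import product

# Values copied from the command template; the exact accepted spellings
# are an assumption.
MODELS = ["llava", "gpt-4", "gemini"]
PROMPT_TYPES = ["open_book", "closed_book", "cot", "symbolic", "self_help"]


def experiment_commands(models=MODELS, prompt_types=PROMPT_TYPES):
    """Return one argv list per (model, prompt type) pair."""
    return [
        ["python", "scripts/models/experiments.py",
         "--model", model, "--prompt_type", prompt_type]
        for model, prompt_type in product(models, prompt_types)
    ]


for argv in experiment_commands():
    print(" ".join(argv))
```

Each argv list can be passed to `subprocess.run` once the printed commands look right.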

Usage for Language Models:

BART

python scripts/models/anli.py --model_name 'ynie/bart-large-snli_mnli_fever_anli_R1_R2_R3-nli' --data_file finfact.json --threshold 0.5

RoBERTa

python scripts/models/anli.py --model_name 'ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli' --data_file finfact.json --threshold 0.5

ELECTRA

python scripts/models/anli.py --model_name 'ynie/electra-large-discriminator-snli_mnli_fever_anli_R1_R2_R3-nli' --data_file finfact.json --threshold 0.5

ALBERT

python scripts/models/anli.py --model_name 'ynie/albert-xxlarge-v2-snli_mnli_fever_anli_R1_R2_R3-nli' --data_file finfact.json --threshold 0.5

XLNet

python scripts/models/anli.py --model_name 'ynie/xlnet-large-cased-snli_mnli_fever_anli_R1_R2_R3-nli' --data_file finfact.json --threshold 0.5

GPT-2

python gpt2_nli.py --model_name 'fractalego/fact-checking' --data_file finfact.json
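
The ANLI commands above pass `--threshold 0.5`, but this README does not say how the threshold is applied. One plausible reading, sketched below purely as an assumption, is that the top NLI class is accepted only when its probability reaches the threshold. The function `verdict_from_nli` and the verdict names (`true`, `false`, `NEI`) are hypothetical and are not taken from scripts/models/anli.py.

```python
def verdict_from_nli(probs, threshold=0.5):
    """Map NLI class probabilities to a fact-checking verdict.

    `probs` is a dict with "entailment", "neutral", and "contradiction" keys.
    Assumed rule (not the repository's confirmed logic): keep the most
    probable class only if its probability reaches `threshold`; otherwise
    fall back to "NEI" (not enough information).
    """
    label, score = max(probs.items(), key=lambda kv: kv[1])
    if score < threshold:
        return "NEI"
    mapping = {"entailment": "true", "contradiction": "false", "neutral": "NEI"}
    return mapping[label]


example = {"entailment": 0.72, "neutral": 0.20, "contradiction": 0.08}
print(verdict_from_nli(example))
```

Raising the threshold under this rule trades coverage for precision, since more claims fall back to the undecided verdict.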

Contribution

We welcome contributions from the community to help improve Fin-Fact. If you have suggestions, bug reports, or want to contribute code or data, please check our CONTRIBUTING.md file for guidelines.

License

Fin-Fact is released under the MIT License. Please review the license before using the dataset.

Contact

For questions, feedback, or inquiries related to Fin-Fact, please contact arangapur@hawk.iit.edu.

We hope you find Fin-Fact valuable for your research and fact-checking endeavors. Happy fact-checking!
