Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model

This paper has been accepted to EMNLP 2023.

Requirements

python==3.7
pytorch==1.11.0
transformers==4.28.1
scipy==1.7.3
scikit-learn==1.0.2
numpy==1.21.5
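
Before running anything, it may help to confirm that the environment matches the pins above. The following is a minimal sketch, not part of this repository: the helper names find_mismatches and installed_versions are hypothetical, and it assumes the packages were installed with pip, where PyTorch is distributed under the name torch.

```python
# Pinned versions copied from the Requirements list above.
# Assumption: pip distribution names are used, so "pytorch" appears as "torch".
PINNED = {
    "torch": "1.11.0",
    "transformers": "4.28.1",
    "scipy": "1.7.3",
    "scikit-learn": "1.0.2",
    "numpy": "1.21.5",
}


def find_mismatches(pinned, installed):
    """Return {package: (wanted, found)} for each package that differs.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        found = installed.get(name)
        # Ignore local build tags such as "+cu113" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Look up the installed version of each distribution (hypothetical helper)."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Backport package needed on Python 3.7.
        from importlib_metadata import PackageNotFoundError, version
    versions = {}
    for name in names:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = find_mismatches(PINNED, installed_versions(PINNED))
    for name, (wanted, found) in report.items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty report means every listed package matches its pinned version.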

Prepare datasets

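Download the benchmark datasets and put them under the directory ./data. Modify the corresponding paths in load_dataset.py if necessary. In the table below, the Source column gives the origin of the documents: C for CNN/DailyMail and X for XSum.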

Setting	Dataset	Val	Test	Source	Link
Inconsistency Detection (SUMMAC Benchmark)	CoGenSum	1281	400	C	https://github.com/tingofurro/summac
	SummEval	850	850	C
	FRANK	671	1575	C+X
	Polytope	634	634	C
	FactCC	931	503	C
	XSumFaith	1250	1250	X
Faithfulness Rating	FRANKCNN	-	1250	C	https://github.com/NJUNLP/CoP
	QAGSCNN	-	235	C	https://github.com/NJUNLP/CoP
	SummEval	-	1600	C	https://github.com/Yale-LILY/SummEval
	FRANKXSUM	-	996	X	https://github.com/NJUNLP/CoP
	QAGSXSUM	-	239	X	https://github.com/NJUNLP/CoP
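
The split sizes in the table can serve as a quick check that each dataset was downloaded and parsed completely. The sketch below is only an illustration: the setting keys "detection" and "rating", the name EXPECTED_SIZES, and the helper size_matches are hypothetical and do not come from load_dataset.py.

```python
# (setting, dataset) -> expected number of examples per split, copied from the table.
# None marks a split that the table lists as "-".
EXPECTED_SIZES = {
    ("detection", "CoGenSum"): {"val": 1281, "test": 400},
    ("detection", "SummEval"): {"val": 850, "test": 850},
    ("detection", "FRANK"): {"val": 671, "test": 1575},
    ("detection", "Polytope"): {"val": 634, "test": 634},
    ("detection", "FactCC"): {"val": 931, "test": 503},
    ("detection", "XSumFaith"): {"val": 1250, "test": 1250},
    ("rating", "FRANKCNN"): {"val": None, "test": 1250},
    ("rating", "QAGSCNN"): {"val": None, "test": 235},
    ("rating", "SummEval"): {"val": None, "test": 1600},
    ("rating", "FRANKXSUM"): {"val": None, "test": 996},
    ("rating", "QAGSXSUM"): {"val": None, "test": 239},
}


def size_matches(setting, dataset, split, examples):
    """Return True if the loaded split has the size listed in the table."""
    expected = EXPECTED_SIZES[(setting, dataset)][split]
    if expected is None:
        raise ValueError(f"{dataset} has no {split} split in the table")
    return len(examples) == expected
```

For example, size_matches("rating", "QAGSCNN", "test", examples) should return True once the 235 QAGSCNN test examples are loaded into examples.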

Probability Calculation

Calculate the probabilities with a foundation language model by running the following command (CUDA_VISIBLE_DEVICES=0 restricts the run to the first GPU):

CUDA_VISIBLE_DEVICES=0 python3 main.py
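
For readers unfamiliar with what "probabilities" means here, the sketch below shows the generic step of turning a language model's per-position vocabulary scores (logits) into the probability it assigns to each observed token. It is a pure-Python illustration under stated assumptions, not the code in main.py: the function names are hypothetical, and the logits are assumed to be already aligned so that step_logits[t] is the score vector used to predict target_ids[t].

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_probabilities(step_logits, target_ids):
    """Probability assigned to each observed token.

    step_logits[t] holds the vocabulary scores used to predict target_ids[t]
    (an assumption about alignment; real model outputs are usually shifted by one).
    """
    if len(step_logits) != len(target_ids):
        raise ValueError("need one score vector per target token")
    return [softmax(scores)[token] for scores, token in zip(step_logits, target_ids)]


# Toy example with a two-word vocabulary and a two-token summary.
probs = token_probabilities([[0.0, 0.0], [0.0, math.log(3)]], [0, 1])
```

In the toy example the first token gets probability 0.5 (equal scores) and the second gets 0.75, since exp(log 3) / (1 + exp(log 3)) = 3/4.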

The results are saved under the directory ./output. They can also be downloaded with this link.

FFLM

The summary-level and system-level performance of FFLM can then be calculated as follows, where xxx stands for the path of the file to evaluate:

python3 summary-level-evaluation.py --file_path xxx
python3 system-level-evaluation.py --file_path xxx
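
The difference between the two levels can be illustrated with a small sketch. It assumes one common reading of the terms: summary-level correlates metric scores with human scores over all individual summaries, while system-level first averages both scores per summarization system and then correlates the averages. The record fields (system, metric, human), the function names, and the choice of Pearson correlation are all assumptions for illustration; the actual scripts may define the levels or the coefficient differently.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def summary_level(records):
    """Correlate metric and human scores over every individual summary."""
    return pearson([r["metric"] for r in records], [r["human"] for r in records])


def system_level(records):
    """Average scores per system first, then correlate the system averages."""
    by_system = defaultdict(list)
    for r in records:
        by_system[r["system"]].append(r)
    metric_means = [sum(r["metric"] for r in rs) / len(rs) for rs in by_system.values()]
    human_means = [sum(r["human"] for r in rs) / len(rs) for rs in by_system.values()]
    return pearson(metric_means, human_means)


# Hypothetical toy records: three systems with two summaries each.
records = [
    {"system": "A", "metric": 1.0, "human": 1.0},
    {"system": "A", "metric": 3.0, "human": 3.0},
    {"system": "B", "metric": 2.0, "human": 4.0},
    {"system": "B", "metric": 4.0, "human": 2.0},
    {"system": "C", "metric": 5.0, "human": 7.0},
    {"system": "C", "metric": 7.0, "human": 5.0},
]
```

On these toy records the system-level correlation is perfect because the per-system averages agree exactly, while the summary-level correlation is lower because individual summaries within systems B and C are ranked in the opposite order. The scipy package listed in the requirements offers pearsonr, spearmanr, and kendalltau for the same purpose.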

Citation

@inproceedings{jia2023fflm,
  title={Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model},
  author={Qi Jia and Siyu Ren and Yizhu Liu and Kenny Q. Zhu},
  booktitle={Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
  year={2023}
}
