# Absinth: Hallucination Detection Dataset of German News Summaries

News: We have extended the Absinth dataset with more fine-grained span annotations for hallucinated instances. The annotations can be found here.


Absinth is a human-annotated dataset for faithfulness detection in the context of German news summarization. The dataset contains 4,335 instances in total, where each instance consists of the following elements (a minimal example of one instance is sketched after the list):

1. **News Article**: The original news article from the 20Minuten dataset.
2. **Summary Sentence**: A machine-generated summary sentence of the news article. The sentence is generated by one of the following language models:
    - **mBART**: multilingual BART fine-tuned on 20Minuten.
    - **mLongT5**: multilingual LongT5 fine-tuned on 20Minuten.
    - **GPT-4**: zero-shot summary by GPT-4.
    - **GPT-4-Intrinsic**: zero-shot summary by GPT-4 containing synthetic intrinsic hallucinations.
    - **GPT-4-Extrinsic**: zero-shot summary by GPT-4 containing synthetic extrinsic hallucinations.
    - **Stable-Beluga-2**: zero-shot summary by StableBeluga2, a Llama2-70B model fine-tuned on an Orca-style dataset.
    - **Llama2-7B**: base Llama2-7B model fine-tuned on 20Minuten using QLoRA.
3. **Label**: The label categorizes the relationship between the news article and the summary sentence and takes one of the following three values:
    - **Faithful**: The information in the sentence is consistent with the news article, without contradicting it or adding external information.
    - **Intrinsic Hallucination**: The sentence contradicts information in the article.
    - **Extrinsic Hallucination**: The sentence contains information that is not present in the article.
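To make the structure above concrete, here is a minimal sketch of what a single instance might look like in Python. The field names and values are illustrative assumptions for readability, not the exact column names of the released files.

```python
# Illustrative only: field names and values are assumptions about the schema,
# not the exact column names of the released Absinth files.
example_instance = {
    "article_id": "20min-0001",                # hypothetical identifier
    "article": "Der Bundesrat hat heute ...",  # full 20Minuten news article (German)
    "summary_sentence": "Der Bundesrat beschliesst neue Massnahmen.",
    "model": "mBART",                          # which system generated the sentence
    "label": "Faithful",                       # one of the three labels described above
}

# The three possible labels described above:
LABELS = {"Faithful", "Intrinsic Hallucination", "Extrinsic Hallucination"}
assert example_instance["label"] in LABELS
```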

For more information about the creation of the dataset, please refer to our paper (see the reference below).

## Download Dataset

- The absinth dataset without the source articles can be accessed on Hugging Face.
- The source articles can be downloaded from here.
- To combine the source articles with the dataset, join the two files on the `article_id` column (a sketch is shown below).
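As a rough sketch of the merge step, assuming both downloads are available locally as tabular files sharing an `article_id` column; the file names below are placeholders, not the actual names of the released files.

```python
import pandas as pd

# Placeholder file names: adjust to wherever the two downloads were saved.
# Assumes both files can be read as tabular data with an `article_id` column.
absinth = pd.read_csv("absinth_annotations.csv")   # dataset without source articles
articles = pd.read_csv("source_articles.csv")      # source articles from 20Minuten

# Join on `article_id` so each annotated summary sentence is paired
# with its full source article.
merged = absinth.merge(articles, on="article_id", how="left")

print(merged.columns.tolist())
print(f"{len(merged)} instances after merging")
```

A left join keeps every annotated instance even if a source article were missing; an inner join would silently drop such rows.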

## Reference

When using the Absinth dataset, please cite:

```bibtex
@inproceedings{mascarell-etal-2024-german,
    title = "German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset",
    author = "Mascarell, Laura and
      Chalummattu, Ribin and
      Rios, Annette",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)",
    month = may,
    year = "2024",
    address = "Turin, Italy",
    publisher = "",
    url = "",
    pages = "",
    abstract = "The advent of Large Language Models (LLMs) has led to remarkable progress on a wide range of natural language processing tasks. Despite the advances, these large-sized models still suffer from hallucinating information in their output, which poses a major issue in automatic text summarization, as we must guarantee that the generated summary is consistent with the content of the source document. Previous research addresses the challenging task of detecting hallucinations in the output (i.e. inconsistency detection) in order to evaluate the faithfulness of the generated summaries. However, these works primarily focus on English and recent multilingual approaches lack German data. This work presents absinth, a manually annotated dataset for hallucination detection in German news summarization and explores the capabilities of novel open-source LLMs on this task in both fine-tuning and in-context learning settings. We open-source and release the absinth dataset to foster further research on hallucination detection in German.",
}
```