CiteME is a benchmark for testing the ability of language models to find the papers cited in scientific texts.
The hand-curated version of the dataset is available at citeme.ai.
It contains the following columns:

- `id`: A unique ID used in all our experiments to reference a specific paper.
- `excerpt`: The text excerpt describing the target paper.
- `target_paper_title`: The title of the paper described by the excerpt.
- `target_paper_url`: The URL of the paper described by the excerpt.
- `source_paper_title`: The title of the paper the excerpt was taken from.
- `source_paper_url`: The URL of the paper the excerpt was taken from.
- `year`: The year the source paper was published.
- `split`: Indicates whether the sample is from the `train` or `test` split.
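As an illustration of the schema, rows with these columns can be read with the standard `csv` module. The row below is a placeholder made up for this example, not a real dataset entry:

```python
import csv
import io

# A hypothetical one-row CSV in the schema described above
# (all values are placeholders, not real dataset entries).
sample = io.StringIO(
    "id,excerpt,target_paper_title,target_paper_url,"
    "source_paper_title,source_paper_url,year,split\n"
    '1,"An excerpt citing [CITATION].",Some Target Paper,'
    "https://example.org/target,Some Source Paper,"
    "https://example.org/source,2023,train\n"
)

rows = list(csv.DictReader(sample))
print(rows[0]["target_paper_title"], rows[0]["split"])
```

The same pattern applies to the downloaded `DATASET.csv`; each `csv.DictReader` row is a dict keyed by the column names listed above.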
CiteAgent requires the following environment variables to function properly:

- `S2_API_KEY`: Your Semantic Scholar API key
- `OPENAI_API_KEY`: Your OpenAI API key (for GPT-4 models)
- `ANTHROPIC_API_KEY`: Your Anthropic API key (for Claude models)
- `TOGETHER_API_KEY`: Your Together API key (for Llama models)
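A quick way to confirm the keys are set before a run is a small check like the one below. The variable names come from the list above; the check itself is not part of CiteAgent:

```python
import os

# Environment variables listed in the README above.
REQUIRED_VARS = ["S2_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "TOGETHER_API_KEY"]

# Collect any variables that are unset or empty.
missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All API keys are set.")
```

If you only use one model family, only the corresponding key is needed for that run.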
- Install the required Python packages listed in `requirements.txt`:

  ```shell
  pip install -r requirements.txt
  ```

- Download the dataset from citeme.ai and place it in the project folder as `DATASET.csv`.

- Run the `main.py` file:

  ```shell
  python src/main.py
  ```
To modify the run parameters, open `src/main.py` and update the `metadata` dict.
To run different models, adjust the `model` entry (e.g. `gpt-4o`, `claude-3-opus-20240229`, or `meta-llama/Llama-3-70b-chat-hf`).
To run the agent without actions, change the executor from `LLMSelfAskAgentPydantic` to `LLMNoSearch` and adjust the `prompt_name` to a `*_no_search` prompt.
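Putting those settings together, the `metadata` dict might look roughly like this. Only the keys `model`, `executor`, and `prompt_name` are named in the text above; the exact shape and the `prompt_name` value here are assumptions:

```python
# Hypothetical sketch of the `metadata` dict in src/main.py;
# only the three keys are taken from the README, the values are examples.
metadata = {
    "model": "gpt-4o",                      # or "claude-3-opus-20240229",
                                            # "meta-llama/Llama-3-70b-chat-hf", ...
    "executor": "LLMSelfAskAgentPydantic",  # "LLMNoSearch" to run without actions
    "prompt_name": "default",               # pick a *_no_search prompt with LLMNoSearch
}
```

Check the actual dict in `src/main.py` for the full set of supported keys and values.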
```bibtex
@inproceedings{press2024citeme,
  title={Cite{ME}: Can Language Models Accurately Cite Scientific Claims?},
  author={Press, Ori and Hochlehnert, Andreas and Prabhu, Ameya and Udandarao, Vishaal and Press, Ofir and Bethge, Matthias},
  booktitle={The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2024}
}
```
Code: MIT. See `LICENSE`.

Dataset: CC-BY-4.0. See `LICENSE_DATASET`.