Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration

✨ Overview

This repository contains the official implementation of our paper Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration.

We introduce a multi-agent collaboration strategy that emulates the academic peer review process. Each agent independently constructs its own solution, provides reviews on the solutions of others, and assigns confidence levels to its reviews. Upon receiving peer reviews, agents revise their initial solutions.
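The three stages described above (solve, review with confidence, revise) can be sketched roughly as follows. This is a minimal illustration and not the code in peer_review.py: the `Agent` interface, the `Review` record, the integer confidence scale, and the `StubAgent` revision rule are all hypothetical stand-ins for LLM calls.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass
class Review:
    reviewer: int    # index of the agent that wrote the review
    feedback: str    # review text (in the stub below: the reviewer's own answer)
    confidence: int  # hypothetical integer scale, higher means more certain


class Agent(Protocol):
    """Hypothetical interface; a real agent would wrap LLM API calls."""

    def solve(self, question: str) -> str: ...
    def review(self, question: str, solution: str) -> Tuple[str, int]: ...
    def revise(self, question: str, solution: str, reviews: List[Review]) -> str: ...


def peer_review_round(question: str, agents: List[Agent]) -> List[str]:
    # Stage 1: every agent constructs its own solution independently.
    solutions = [agent.solve(question) for agent in agents]

    # Stage 2: every agent reviews each peer's solution and attaches a confidence.
    inbox: Dict[int, List[Review]] = {i: [] for i in range(len(agents))}
    for r, reviewer in enumerate(agents):
        for a, solution in enumerate(solutions):
            if r == a:
                continue  # agents do not review their own solution
            feedback, confidence = reviewer.review(question, solution)
            inbox[a].append(Review(r, feedback, confidence))

    # Stage 3: every agent revises its initial solution given the reviews it received.
    return [agent.revise(question, solutions[i], inbox[i]) for i, agent in enumerate(agents)]


class StubAgent:
    """Deterministic stand-in for an LLM agent, for illustration only."""

    def __init__(self, answer: str, confidence: int) -> None:
        self.answer = answer
        self.confidence = confidence

    def solve(self, question: str) -> str:
        return self.answer

    def review(self, question: str, solution: str) -> Tuple[str, int]:
        # The stub's "review" is simply its own answer plus its confidence.
        return self.answer, self.confidence

    def revise(self, question: str, solution: str, reviews: List[Review]) -> str:
        # Assumed rule: adopt the most confident review only if it is more
        # confident than this agent; otherwise keep the initial solution.
        stronger = [rv for rv in reviews if rv.confidence > self.confidence]
        if not stronger:
            return solution
        return max(stronger, key=lambda rv: rv.confidence).feedback


agents = [StubAgent("18", 9), StubAgent("20", 3), StubAgent("18", 7)]
print(peer_review_round("toy question", agents))  # prints ['18', '18', '18']
```

In this toy run the low-confidence agent that answered "20" switches to "18" after reading a high-confidence review, while the other two keep their answers. How an actual LLM agent weighs confidence during revision is determined by the prompts, which this sketch does not reproduce.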

Extensive experiments on three different types of reasoning tasks show that our collaboration approach delivers superior accuracy across all ten datasets compared to existing methods.

If you have any questions, please feel free to contact us by e-mail at xuzhenran.hitsz@gmail.com or open an issue in the repository.

🔥 News

[Nov 14, 2023] We released the code and results of our method.

🚀 Example

🚨 Usage

Environment

conda create -n MAPR python=3.9
conda activate MAPR
pip install -r requirements.txt

Run

Take the GSM8K dataset as an example.

1. Peer Review

python peer_review.py --task GSM8K --openai_key YOUR_KEY --openai_organization YOUR_ORG

2. Debate

python debate.py --task GSM8K --openai_key YOUR_KEY --openai_organization YOUR_ORG

3. Peer Review w/o Confidence

python feedback.py --task GSM8K --openai_key YOUR_KEY --openai_organization YOUR_ORG

4. Self-correction

python self_correction.py --task GSM8K --openai_key YOUR_KEY --openai_organization YOUR_ORG

5. Majority and Zero-shot CoT

python single_agent.py --task GSM8K --openai_key YOUR_KEY --openai_organization YOUR_ORG
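For readers unfamiliar with the Majority baseline, the sketch below assumes it means majority voting over several independently sampled answers from a single agent. The function name `majority_vote` and the tie-breaking rule are illustrative assumptions and are not taken from single_agent.py.

```python
from collections import Counter
from typing import List, Optional


def majority_vote(answers: List[Optional[str]]) -> Optional[str]:
    """Return the most frequent non-empty answer, or None if there is none.

    Ties go to the answer that appeared first, because Counter.most_common
    keeps insertion order among equal counts.
    """
    votes = Counter(a.strip() for a in answers if a is not None and a.strip())
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical final answers extracted from three sampled responses.
print(majority_vote(["18", "20", "18"]))  # prints 18
```

Zero-shot CoT would correspond to using a single sampled response directly, without the vote.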

Evaluate

Take the GSM8K dataset as an example.

python eval.py --task GSM8K --method peer_review --time_flag 1113
