GSM8K-Consistency is a benchmark database for analyzing the consistency of arithmetic reasoning on GSM8K.
Updated Dec 31, 2023
[ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models".
Fuzzy reasoning of Generalized Quantifiers
Code and Data Repo for NeurIPS 2024 Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"
[EMNLP 2023, Findings] GRACE: Discriminator-Guided Chain-of-Thought Reasoning
MathPrompter Implementation: This repository implements the methods described in the Microsoft Research paper "MathPrompter: Mathematical Reasoning Using Large Language Models".
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
Small and Efficient Mathematical Reasoning LLMs
The lecture notes for my discrete mathematics classes.
Resources of deep learning for mathematical reasoning (DL4MATH).
ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].