AI for Mathematics

Reading List

Paper	Base Language Model	Code	Publication	Preprint	Affiliation
Solving olympiad geometry without human demonstrations	Transformer-style (151M)	AlphaGeometry	Nature	2401.blog	DeepMind
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model	LLaMA (LLaVA)	G-LLaVA		2312.11370	HUAWEI
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving	GPT-4, LLaMA-2, etc.	ToRA		2309.17452	Microsoft

Datasets

MathPile, a diverse and high-quality math-centric corpus comprising about 9.5 billion tokens, introduced on the premise that high-quality, large-scale corpora are the cornerstone of building powerful foundation models.

Benchmarks

GSM8K | paper | blog, a dataset of 8.5K high-quality, linguistically diverse grade school math word problems, by OpenAI, 2021
MATH | paper (NeurIPS 2021), Hard competition mathematics problems, 12.5K problems across 7 subjects, by UCB, 2021
TheoremQA | paper, Hard theorem-driven problems, 800 QA pairs covering 350+ theorems spanning Math, EE&CS, Physics and Finance, by UWaterloo, 2023