Paper | Base Language Model | Code | Publication | Preprint | Affiliation |
---|---|---|---|---|---|
Solving olympiad geometry without human demonstrations | Transformer-style (151M) | AlphaGeometry | Nature | 2401.blog | DeepMind |
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model | LLaMA (LLaVA) | G-LLaVA | | 2312.11370 | HUAWEI |
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving | GPT4, LLaMA2, etc. | ToRA | | 2309.17452 | Microsoft |
- Datasets and Benchmarks
Datasets
- MathPile, a diverse and high-quality math-centric corpus comprising about 9.5 billion tokens, built on the premise that high-quality, large-scale corpora are the cornerstone of powerful foundation models.
Benchmarks
- GSM8K | paper | blog, a dataset of 8.5K high-quality, linguistically diverse grade school math word problems, by OpenAI, 2021
- MATH | paper (NeurIPS 2021), 12.5K challenging competition mathematics problems across 7 subjects, by UCB, 2021
- TheoremQA | paper, a theorem-driven question answering dataset of 800 QA pairs covering 350+ theorems spanning Math, EE&CS, Physics and Finance, by UWaterloo, 2023