Eleanor-H/MUSTARD
Code & data for ICLR 2024 spotlight paper: 🍯MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data
MUSTARD helps mathematicians, educators, and AI researchers generate diverse datasets of mathematical problems and proofs. It takes user-defined parameters for problem type, level, and keywords, and outputs structured data containing informal and formal (Lean) versions of theorems and proofs. This is ideal for training AI models that can solve math word problems or perform automated theorem proving.
No commits in the last 6 months.
Use this if you need to create a large, varied dataset of mathematical theorems and proofs for training or evaluating AI models in mathematics education or automated reasoning.
Not ideal if you're looking for an interactive theorem prover or a tool to help you personally solve mathematical problems rather than generate data for AI.
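As a rough illustration of the inputs and outputs described above, the sketch below models one generation request and one resulting record in Python. Every class and field name here (GenerationRequest, TheoremRecord, problem_type, and so on) is hypothetical and chosen for this example only; the repository's actual schema may differ.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class GenerationRequest:
    """Hypothetical container for the user-defined parameters."""
    problem_type: str      # e.g. a word problem or a theorem-proving task
    level: str             # e.g. an educational level
    keywords: List[str]    # concepts the generated problem should involve


@dataclass
class TheoremRecord:
    """Hypothetical container for one generated sample."""
    informal_statement: str
    informal_proof: str
    formal_statement: str  # Lean source for the theorem statement
    formal_proof: str      # Lean source for the proof term or tactic block


request = GenerationRequest(
    problem_type="theorem_proving",
    level="elementary",
    keywords=["addition", "natural numbers"],
)

record = TheoremRecord(
    informal_statement="Show that 2 + 3 = 5.",
    informal_proof="Evaluating the sum directly gives 5.",
    formal_statement="theorem two_add_three : 2 + 3 = 5",
    formal_proof="rfl",
)

# Serialize the pair as one JSON line, a common layout for training datasets.
line = json.dumps({"request": asdict(request), "record": asdict(record)})
```

Storing the informal and formal versions side by side in one record keeps each natural-language proof paired with its Lean counterpart.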
Stars: 42
Forks: 2
Language: C++
License: —
Category:
Last pushed: May 29, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Eleanor-H/MUSTARD"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
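The same request can be sketched in Python using only the standard library. The helper names quality_url and fetch_quality are made up for this example, and the code assumes the endpoint returns JSON, which the listing does not confirm.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the request URL for one repository (hypothetical helper)."""
    owner_part = urllib.parse.quote(owner, safe="")
    repo_part = urllib.parse.quote(repo, safe="")
    return f"{BASE_URL}/{owner_part}/{repo_part}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("Eleanor-H", "MUSTARD")
```

With the keyless limit of 100 requests per day, caching the decoded response locally is worth considering if many repositories are queried.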
Higher-rated alternatives
ExtensityAI/symbolicai
A neurosymbolic perspective on LLMs
TIGER-AI-Lab/MMLU-Pro
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding...
deep-symbolic-mathematics/LLM-SR
[ICLR 2025 Oral] This is the official repo for the paper "LLM-SR" on Scientific Equation...
microsoft/interwhen
A framework for verifiable reasoning with language models.
zhudotexe/fanoutqa
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language...