eth-lre/mathtutorbench

Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors, EMNLP 2025 Oral

38 / 100 (Emerging)

This project provides a standardized way to test how well AI language models act as math tutors. You supply a math tutoring model, and the benchmark produces a detailed report on its performance across seven key teaching skills, such as problem-solving assistance and mistake correction. Educators, instructional designers, and AI developers building educational tools can use it to understand and improve their AI tutors.

Use this if you are developing or evaluating an AI model designed to tutor students in mathematics and need a comprehensive, automated way to assess its pedagogical effectiveness.

Not ideal if you are looking for a general-purpose AI evaluation tool or a benchmark for educational AI outside mathematics.

AI-in-education math-tutoring pedagogical-assessment language-model-evaluation instructional-AI
No License, No Package, No Dependents
Maintenance 6 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 17 / 25
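
The four sub-scores above add up to the overall 38 / 100. The listing does not state the formula, so the following is only a quick arithmetic check under the assumption that the overall score is a plain sum of the four categories, each capped at 25. The variable names are illustrative.

```python
# Sub-scores as listed on the page; each category is out of 25.
subscores = {
    "Maintenance": 6,
    "Adoption": 7,
    "Maturity": 8,
    "Community": 17,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the simple sum of the sub-scores.
total = sum(subscores.values())
max_total = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {max_total}")
```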

Stars: 32
Forks: 10
Language: Python
License: none
Last pushed: Nov 18, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/eth-lre/mathtutorbench"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
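
The curl request above can also be issued from Python using only the standard library. This is a minimal sketch: the URL is taken from the curl command, while the helper names (`build_quality_url`, `fetch_quality`) are hypothetical, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name' (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repo.

    Assumption: the endpoint returns a JSON body; adjust the decoding if it does not.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("llm-tools", "eth-lre/mathtutorbench")
# print(data)
```

No key is passed here, so the 100 requests/day anonymous limit applies. How a key would be supplied is not described on this page and is left out of the sketch.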