naivoder/MCTSr
Monte Carlo Tree Search Self-Refine (MCTSr)
This project helps AI researchers and developers evaluate the mathematical problem-solving abilities of large language models (LLMs). It runs Monte Carlo Tree Search Self-Refine against a local LLaMA instance: the model's answers to mathematical word problems and equations are generated, scored, and iteratively refined, and the output reports how well the model reasons on these challenging datasets.
No commits in the last 6 months.
Use this if you are an AI researcher or developer focused on understanding and improving the mathematical reasoning capabilities of LLMs.
Not ideal if you are looking for a fully polished, production-ready tool for general LLM evaluation without deep technical engagement.
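The refine-and-search loop the description refers to can be sketched as follows. This is an illustrative reconstruction, not the repository's actual code: the `refine` and `score` callables stand in for calls to the local LLaMA instance, and all names and parameters are assumptions.

```python
import math

def mctsr(initial_answer, refine, score, iterations=8, c=1.4):
    """Minimal MCTS Self-Refine sketch (illustrative, not the repo's code).

    refine(answer) -> a revised answer string.
    score(answer)  -> a reward in [0, 1] (e.g. an LLM self-evaluation).
    """
    # Each node holds an answer, its accumulated reward, and visit count.
    root = {"answer": initial_answer, "reward": score(initial_answer),
            "visits": 1, "children": []}

    def uct(node, parent_visits):
        # Standard UCT: exploit average reward, explore rarely-visited nodes.
        exploit = node["reward"] / node["visits"]
        explore = c * math.sqrt(math.log(parent_visits) / node["visits"])
        return exploit + explore

    for _ in range(iterations):
        # Selection: descend by UCT until reaching a leaf.
        path = [root]
        node = root
        while node["children"]:
            node = max(node["children"], key=lambda ch: uct(ch, node["visits"]))
            path.append(node)
        # Expansion + evaluation: refine the leaf's answer and score it.
        revised = refine(node["answer"])
        reward = score(revised)
        node["children"].append({"answer": revised, "reward": reward,
                                 "visits": 1, "children": []})
        # Backpropagation: push the reward up the selected path.
        for n in path:
            n["reward"] += reward
            n["visits"] += 1

    # Return the answer with the best average reward anywhere in the tree.
    best, stack = root, [root]
    while stack:
        n = stack.pop()
        if n["reward"] / n["visits"] > best["reward"] / best["visits"]:
            best = n
        stack.extend(n["children"])
    return best["answer"]
```

With a toy `refine` that increments a number and a `score` that rewards only the target value, the loop steers the search toward the higher-scoring refinement.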
Stars: 22
Forks: 2
Language: Python
License: —
Category:
Last pushed: Jul 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/naivoder/MCTSr"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-thought/reasoning-gym
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Hmbown/Hegelion
Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis)
LLM360/Reasoning360
A repo for open research on building large reasoning models
bowang-lab/BioReason
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | NeurIPS '25
TsinghuaC3I/Awesome-RL-for-LRMs
A Survey of Reinforcement Learning for Large Reasoning Models