sail-sg/understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
This project is a critical study of R1-Zero-style training for mathematical reasoning. It takes base models, such as those from Qwen or Llama, and applies reinforcement learning to significantly improve their ability to solve complex math problems, while analyzing what actually drives the gains. It is aimed at machine learning researchers building more capable models for mathematical problem-solving.
1,224 stars. No commits in the last 6 months.
Use this if you are a machine learning researcher focused on improving the mathematical reasoning capabilities of large language models through advanced training techniques like reinforcement learning.
Not ideal if you are a general user looking for a pre-trained model for everyday math problems or if you lack a deep understanding of LLM training and reinforcement learning concepts.
Stars: 1,224
Forks: 57
Language: Python
License: MIT
Category:
Last pushed: Aug 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sail-sg/understand-r1-zero"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
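The same endpoint can be called programmatically. The sketch below is a minimal Python wrapper around the URL shown in the curl example; the `quality_url` and `fetch_quality` helper names are hypothetical, and the JSON response shape is not documented here, so the decoded payload is returned as-is.

```python
import json
import urllib.request

# Base path taken from the curl example above; everything after it
# (owner/repo) is appended per query.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the API URL for a given GitHub owner/repo pair (hypothetical helper)."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report.

    No API key is needed for up to 100 requests/day; the response
    schema is an assumption and is returned undecoded beyond JSON.
    """
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(quality_url("sail-sg", "understand-r1-zero"))
```

Calling `fetch_quality("sail-sg", "understand-r1-zero")` performs the same request as the curl command above.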
Higher-rated alternatives
cvs-health/uqlm
UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM...
PRIME-RL/TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
sapientinc/HRM
Hierarchical Reasoning Model Official Release
tigerchen52/query_level_uncertainty
query-level uncertainty in LLMs
reasoning-survey/Awesome-Reasoning-Foundation-Models
✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models